



AN INTRODUCTION TO 


Statistical Analysis 

C. H. RICHARDSON, Ph.D.

Professor of Mathematics, Bucknell University


REVISED EDITION


HARCOURT, BRACE AND COMPANY


NEW YORK



COPYRIGHT, 1934, 1935, 1944, BY 
HARCOURT, BRACE AND COMPANY, INC. 


All rights reserved. No part of this book may be reproduced
in any form, by mimeograph or any other means, without per- 
mission in writing from the publisher. 

[f • IO • 49 ] 


PRINTED IN THE UNITED STATES OF AMERICA



PREFACE 


It is the aim of this book to present the fundamental notions 
of statistical analysis in such a manner that they can be comprehended
by students who have had but little training in mathematics
and yet in such a way that they can be studied to advantage even 
by those who have had considerable mathematics. To supplement 
the mathematical preparation of the former group we have inter- 
mittently interrupted the continuity of the statistical procedure by 
inserting sections on certain topics of advanced algebra and analytic 
geometry such as sums and summations, some properties of the 
straight line, permutations, combinations, and the elementary theory 
of probability. 

Many of the basic notions of statistical analysis are expressed by 
formulas, the derivations of which have been assumed — altogether 
too frequently — to be hidden in a maze of higher mathematics. 
For a number of years we have encountered a growing opinion in
some circles — betrayed by clever innuendo and subtle insinuation 
when not definitely expressed — that how to use a formula and what 
it means are the primary desiderata in statistical analysis and that 
how it is derived and what are its limitations are of secondary im- 
portance. It is our conviction that a reader will not comprehend 
fully what a formula means and what are its limitations unless he 
knows whence it comes and what are the assumptions underlying 
its development. 

Since the mathematical attainments necessary for an understanding
of the development of many of our basic formulas include no
more than a knowledge of algebra through the binomial theorem, 
the theory and use of logarithms, and the progressions — topics that 
are included in a well organized course of secondary algebra — we 
have included many derivations that come within the grasp of the 
ordinary student. The limited preparation in mathematics that we 
assume on the part of our readers requires that difficult derivations 
be generally omitted. 



While the theory of statistical analysis is not easy, yet the diffi- 
culties are, in the main, due to the newness rather than to the ab- 
struseness of the notions encountered. The concepts will, therefore, 
become more meaningful and less terrifying if the student will be 
required to solve many of the numerous exercises that have been 
provided in the text. 

Statistical analysis boils down ultimately to numerical results: 
the methods and processes used in obtaining them and the methods 
and means for estimating their reliability. The earlier chapters of 
the book are concerned mainly with the methods, processes, and forms 
used in obtaining numerical results and the later chapters deal with 
estimating their reliability. 

The plan used in the development of the text may be briefly de- 
scribed as follows: Each topic is introduced with a brief statement of 
“what it is all about.” Then follows a brief statement of the under- 
lying theory of the topic under consideration which leads directly 
and simply to a development of the necessary formulas and processes. 
The reader is then shown how to use the formulas and processes to 
obtain the desired numerical results. Finally, the limitations of 
the formulas and processes and the significance and the reliability 
of the computed results are given due emphasis. Thus a student 
learns why a formula is applied, whence it is derived, how it is used and 
what are its limitations; he learns not only how to obtain the numeri- 
cal results but also how to measure their reliability. 

The method of treatment of all the topics is decidedly elementary. 
The graphical method has been widely employed and the explana- 
tions have been purposely detailed in order that the book may be 
more readily understood. Since the book undertakes to develop 
skills in deriving statistical results as well as to assist in understanding 
their significance, numerous exercises have been placed at strategic 
points in the text. This feature of solving exercises after a major 
topic has been considered adds to the teachableness of the subject, 
facilitates an understanding of the principles, and aids the student in 
acquiring the useful skills for statistical computations and inter- 
pretation. 

In general, the exercises are based upon actual rather than imagi- 
nary data in order that the study may proceed, if possible, with real 
life situations. The alert teacher can improvise “homemade”
exercises as he needs them. Throughout the text, it is supposed that
a computing machine is at the disposal of the student; nevertheless 
many of the exercises can be done satisfactorily with a slide rule or 
a table of logarithms, powers, and roots. 

No attempt has been made to make this text an exhaustive treatise 
on statistical analysis. Many topics, such as multiple correlation 
and frequency curves, have been studiously omitted. We have tried 
to keep in mind that we are writing an Introduction that would 
include the minimum essentials, at the same time hoping that this 
Introduction might inspire the reader to continue his study into the 
more advanced fields. 

I wish at this time to renew my thanks to Professor James W. 
Glover and Professor Harry C. Carver of the University of Michigan 
for their most generous aid to me when I was under their instruction. 

I also hasten to express my gratitude to Professor C. H. Forsyth of 
Dartmouth College and Professor Ralph W. Tyler of Ohio State Uni- 
versity, who critically read the manuscript and made numerous 
helpful suggestions. 

For any errors, I alone am responsible. Although the text has been 
checked painstakingly, it is not to be hoped that a publication of this 
character will appear without some errors creeping in. For the 
notification of such errors I shall be most grateful. 

PREFACE TO THE ENLARGED EDITION 

It has been an unexpected gratification to the author and the 
publishers alike to find that an enlarged edition of the book is called 
for so soon after its publication. The only criticism of consequence 
that has been made of the book was due to our omission of Index 
Numbers. After sampling the opinion of many teachers, it was felt 
desirable to add a chapter devoted to that topic. The opportunity 
has been taken to recast certain paragraphs, and such errors as were 
noted have been corrected. To all who have assisted me with their 
suggestions or by directing my attention to errors, I wish to express 
my sincere gratitude. 

C. H. Richardson 

Lewisburg, Pennsylvania 
April 30, 1935 




PREFACE TO THE REVISED EDITION 


About ten years have elapsed since the first edition of this book 
was published and twelve to fifteen years since the material for the 
first edition was collected and prepared. During this time a tre- 
mendous appreciation of and respect for statistical techniques have 
developed. A considerable extension of the use of statistical tech- 
niques in business, in public administration, and in the social sciences 
is very much in evidence. Research workers in biology, in education, 
in psychology, in sociology, in agriculture, lean more heavily on 
statistical techniques than ever before. And, with the passing of 
time, there has come a demand for more than primer notions: a 
deeper understanding of basic ideas is mandatory. For example, it 
no longer suffices merely to compute a statistical constant or statistic: 
one must evaluate it, determine its worth. 

Notable gains have been made during the past decade in the de- 
velopment of new and in the improvement of old techniques. En- 
riching the old areas and exploring new ones have challenged some of 
the best minds of the world. Creative minds in pure as well as in 
applied mathematics have attacked fundamental problems so vigor- 
ously that now the literature of the field is colossal. 

Having been alert to these new developments and improvements, 
it is our wish to incorporate those that are appropriate into this new 
edition. In doing this we have sought to retain the main features of 
the first edition since the plan of its construction has met the approval 
of a wide audience of teachers and students. The two objectives, 
statistical description and statistical evaluation, have been kept in 
mind. In this edition we are not giving less attention to statistical 
description but we have been careful to give more emphasis to statis- 
tical evaluation and statistical induction. It is essential that the 
student be able to compute a statistic: it is just as essential that he 
know what he has when he has it, and to know, in terms of proba- 
bility, what he can do with it. Consequently, we have made a great 
effort to make the techniques and computations meaningful. At 
the risk of being prolix, we have given rather full verbal discussions 
of important matters; our illustrative examples are numerous and 
their solutions detailed. 



Along with the progress that has been made in improving old 
techniques and in establishing new ones, there has come an enlarged 
opportunity for the study of statistical analysis by more and more 
students of our colleges. Due to its wide application a knowledge 
of statistical methods is now a “must” in a program for a liberal
education. Of course this growth has been influenced greatly by the 
desire of thinkers to replace as far as possible the subjective elements 
of their fields by objective procedures. On the whole, this substitu- 
tion of objectivity for mere opinion has been healthful. 

The thirteen chapters of this edition fall into two divisions, each 
division associated with a definite objective. The first ten chapters 
emphasize statistical description whereas the last three chapters em- 
phasize statistical induction. A study of the entire book is con- 
sequently necessary if one would seek an understanding of what is 
now considered to be the essentials of elementary statistics, statistical 
description and statistical induction. 

One new chapter, Multiple Correlation, has been added to the 
present edition. New sections pertaining to other topics have been 
inserted. Many sections have been completely rewritten, others 
greatly amplified. The numerical exercises have been multiplied 
and the algebra of statistics has been extended. The book has there- 
fore not only been revised but greatly enlarged, thus providing a 
wider selection of topics for the teachers. 

Many friends and teachers have rendered invaluable assistance 
with their sympathetic suggestions for the improvement of the book. 
These suggestions have come to me over the years. I wish that I 
might mention here each contributor personally but the list is too 
long. However, I do want to again express my thanks to my friend 
and former teacher, Professor A. R. Crathorne of the University of
Illinois, whose generous and tactful suggestions have been invaluable. 
Also, I want to express my thanks to my colleague, Mr. Paul Benson, 
who has assisted with the proof and has made numerous helpful sug- 
gestions. Of course for any errors, I alone am responsible. 

In this edition I am including answers to many of the exercises. 
Obviously, it is too much to expect that all of them are correct. For 
the notification of any errors I shall be very grateful. 


Lewisburg, Pennsylvania 
September 1, 1943 


C. H. Richardson 



Contents 

1. INTRODUCTION 

SECTION PAGE 

1. THE MEANING AND IMPORTANCE OF STATISTICS 1 

2. MATHEMATICAL AND NON-MATHEMATICAL ASPECTS OF 

STATISTICS 4 

3. VARIABLES AND FUNCTIONS 5 

4. SUMS AND SUMMATIONS 7 

5. REMARKS ON MEASUREMENT 14 

6. DECIMAL ACCURACY 14 

7. SIGNIFICANT FIGURES 15 

8. ROUNDING OFF NUMBERS 16 

9. ERRORS IN CALCULATIONS 16 

10. THE PROPAGATION OF ERRORS 18 

2. TABULAR AND GRAPHICAL REPRESENTATION: 
FREQUENCY DISTRIBUTIONS 

11. INTRODUCTION 23 

12. CLASSIFICATION OF THE DATA 23 

13. THE CHOICE OF THE CLASS INTERVAL 30 

14. CLASS LIMITS 30 

15. GRAPHICAL REPRESENTATION 37 

16. GRAPHICAL REPRESENTATION OF FREQUENCY 

DISTRIBUTIONS 37 

17. GRAPHICAL REPRESENTATION OF TEMPORAL 

DISTRIBUTIONS 43 

18. CUMULATIVE DISTRIBUTIONS AND CURVES 48 

19. TYPES OF FREQUENCY CURVES 51 

20. SUGGESTIONS FOR TABULAR AND GRAPHICAL 

PRESENTATION 53 

3. MEASURES OF CENTRAL TENDENCY 

21. INTRODUCTION 59 

22. THE ARITHMETIC MEAN, Mx 60 


23. THE ARITHMETIC MEAN AS A MOMENT 62 

24. A SHORT METHOD FOR COMPUTING THE ARITHMETIC MEAN 71 

25. THE MEDIAN, Md 76 

26. THE MODE, Mo 80 

27. THE GEOMETRIC MEAN, Mg 87

28. THE HARMONIC MEAN, Mh 92

29. DISCUSSION AND CRITICISM OF THE MEASURES OF CENTRAL

TENDENCY 98 

A. THE ARITHMETIC MEAN 99 

B. THE MEDIAN 100 

C. THE MODE 100 

D. THE GEOMETRIC MEAN 101 

4. MEASUREMENT OF DISPERSION 

30. THE INADEQUACY OF MEASURES OF CENTRAL TENDENCY 111 

31. THE RANGE 114 

32. THE QUARTILE DEVIATION 115 

33. THE MEAN DEVIATION 120 

34. THE STANDARD DEVIATION 125 

35. THE NORMAL CURVE 134 

36. THE PROBABLE ERROR 137 

37. THE SIGNIFICANCE OF THE MEAN AND THE STANDARD 

DEVIATION 141 

5. SKEWNESS: EXCESS: MOMENTS

38. INTRODUCTION 150 

39. THE MEANING OF SKEWNESS 150 

40. THE MEASUREMENT OF SKEWNESS 151 

41. EXCESS OR KURTOSIS 158 

42. THE UNADJUSTED MOMENTS OF A DISTRIBUTION 159 

43. THE ADJUSTED MOMENTS: SHEPPARD’S CORRECTIONS 163 

44. COMPUTATION OF THE MOMENTS 164 

45. RETROSPECT AND PROSPECT 169 

6. INDEX NUMBERS

46. INTRODUCTION 174 

47. RELATIVES 174 




48. DEFINITIONS AND NOTATION 177 

49. UNWEIGHTED INDEX NUMBERS 178 

50. WEIGHTING 184 

51. WEIGHTED AGGREGATES 185 

52. WEIGHTED AVERAGES OF RELATIVES 188 

A. THE WEIGHTED ARITHMETIC MEAN OF RELATIVES 188 

B. WEIGHTED GEOMETRIC MEAN OF RELATIVES 191 

53. SUMMARY AND EXTENSION 194 

54. BIAS 197 

55. FISHER'S IDEAL INDEX 198

CONCLUSION 200 

7. LINEAR TRENDS 

56. INTRODUCTION 203 

57. SOME CHARACTERISTIC PROPERTIES OF A STRAIGHT LINE 204 

58. THE EQUATION OF A STRAIGHT LINE 206 

59. FITTING A STRAIGHT LINE TO OBSERVED DATA 210 

A. THE METHOD OF LEAST SQUARES 210 

B. THE METHOD OF MOMENTS 219 

60. THE STRAIGHT LINE WITH THE ORIGIN AT THE 

CENTROIDAL POINT 221

61. FITTING A STRAIGHT LINE TO A TIME SERIES 226 

8. SIMPLE CORRELATION 

62. MEASURES OF CONCENTRATION OF POINTS ABOUT THE 

LINE OF REGRESSION 232 

63. THE BRAVAIS-PEARSON COEFFICIENT OF CORRELATION 237

64. COMPUTATION OF r FOR UNGROUPED DATA 241

65. OTHER FORMS OF r 244

66. SUMMARY AND EXTENSION OF THE THEORY OF 

CORRELATION 247 

67. COMPUTATION OF r FOR GROUPED DATA 253

68. CORRELATION BY RANKS 263 

69. CORRELATION AND CAUSATION 267 

9. MULTIPLE CORRELATION 

70. PRELIMINARY EXPLANATION 277 

71. THE CASE OF THREE VARIABLES 278 

72. CONTINUATION OF THREE VARIABLES 282 




73. COEFFICIENT OF MULTIPLE CORRELATION FOR THREE 


VARIABLES 286 

74. DETERMINANTS 288 

A. DETERMINANTS OF THE SECOND ORDER 288 

B. DETERMINANTS OF THE THIRD ORDER 290 

C. DETERMINANTS OF ANY ORDER 292 

75. APPLICATIONS OF DETERMINANTS 293 

THREE VARIABLES 

76. PARTIAL CORRELATION 295 

77. THE CASE OF FOUR VARIABLES 297 

78. THE CASE OF n VARIABLES 303 


10. NONLINEAR TRENDS: CURVE-FITTING

79. INTRODUCTION 306

80. THE PROCESS OF DIFFERENCING 307

81. FITTING A STRAIGHT LINE TO OBSERVED DATA 311

A. THE METHOD OF SELECTED POINTS 311

B. THE METHOD OF AVERAGES 313

C. THE METHOD OF LEAST SQUARES 314

82. THE EXPONENTIAL FUNCTION: $Y = ab^X$ 316

83. THE POWER FUNCTION: $Y = aX^b$ 323

84. THE PARABOLA: $Y = aX^2 + bX + c$ 330

85. OTHER USEFUL CURVES 333

A. THE HYPERBOLA: $Y = a + \frac{b}{X}$ 333

B. THE HYPERBOLA: $Y = \frac{1}{a + bX}$ 334

C. THE MODIFIED EXPONENTIAL: $Y = a + bc^X$ 334

D. THE MODIFIED POWER FUNCTION: $Y = c + aX^b$ 337

86. LIMITATIONS OF EMPIRICAL EQUATIONS 338

87. GRAPHICAL METHODS IN TREND ANALYSIS 340

A. ARITHMETIC PAPER 341

B. SEMI-LOGARITHMIC PAPER 342

C. LOGARITHMIC PAPER 346

88. GOODNESS OF FIT OF CURVES TO OBSERVED DATA:

NONLINEAR CORRELATION 354

A. GOODNESS OF FIT 354

B. NONLINEAR CORRELATION 355




11. PERMUTATIONS, COMBINATIONS, AND PROBABILITY 



89. INTRODUCTION 362 

90. PERMUTATIONS 364 

91. NUMBER OF PERMUTATIONS 366 

92. COMBINATIONS 367 

93. RELATIVE FREQUENCY: EMPIRICAL PROBABILITY 370 

94. THEORETICAL RELATIVE FREQUENCY: A PRIORI 

PROBABILITY 372 

95. EXPECTATION 374 

96. SOME ELEMENTARY THEOREMS 374 

A. MUTUALLY EXCLUSIVE EVENTS 374 

B. INDEPENDENT EVENTS 375 

C. DEPENDENT EVENTS 376 

97. REPEATED TRIALS 377 

12. THE POINT BINOMIAL AND THE NORMAL CURVE 

98. INTRODUCTION 383 

99. CHARACTERISTICS OF THE POINT BINOMIAL 384

A. THE MODE 385

B. THE MEAN, THE DISPERSION, THE SKEWNESS 387 

100. THE POINT BINOMIAL APPLIED TO FREQUENCY 

DISTRIBUTIONS 391 

101. THE NORMAL CURVE: INTRODUCTORY REMARKS 395 

102. DERIVATION OF THE EQUATION TO THE NORMAL CURVE 397 

103. SOME PROPERTIES OF φ(t) 401

104. ILLUSTRATIVE EXAMPLES 405

105. ON THE SIGNIFICANCE OF RESULTS 409

106. GRADUATION OF A DISTRIBUTION BY THE NORMAL CURVE 413

A. GRADUATION BY ORDINATES 414 

B. GRADUATION BY AREAS 415 

13. THE THEORY OF SAMPLING: MEASURES
OF RELIABILITY

107. INTRODUCTION 419

108. THE PROBLEM OF THIS CHAPTER 420

109. THE STANDARD DEVIATION IN CLASS FREQUENCIES 422

110. AN EXPERIMENT IN SAMPLING 425




111. THE DISTRIBUTION OF MEANS 429 

A. THE MEAN OF THE MEANS 429 

B. THE STANDARD DEVIATION OF THE MEANS 430 

C. THE PROBABLE ERROR OF THE MEAN 432 

D. THE SKEWNESS AND EXCESS OF THE DISTRIBUTION 

OF MEANS 439 

112. THE RELIABILITY OF THE STANDARD DEVIATION 442 

113. THE RELIABILITY OF THE DIFFERENCE BETWEEN TWO 

MEASURES 443 

114. SMALL SAMPLES 450 

115. CONCLUDING REMARKS ON SAMPLING 456 

116. SUMMARY OF RELIABILITY FORMULAS 456 

APPENDICES 

A. SELECTED BOOKS FOR SUPPLEMENTARY READING 465 

B. AREAS AND ORDINATES OF THE NORMAL CURVE 467 

C. FOUR-PLACE LOGARITHMS AND ANTILOGARITHMS 471 

ANSWERS TO EXERCISES 475 

INDEX 495 



Chapter 1 

INTRODUCTION 

1. THE MEANING AND IMPORTANCE 
OF STATISTICS 

During the last half-century,¹ the thinking world seems to have
awakened to an unusually deep appreciation of and respect for 
numerical facts. Even the untrained mind has confidence in a 
conclusion stated in numerical language and supported by numerical 
facts. Whether the affairs are of state or laboratory, we must have 
observed that quantitative facts concerning them are collected in 
boundless profusion. The social and biological sciences, which were 
qualitative a few decades ago, have now become largely quantita- 
tive. Masses of numerical data are collected by individuals, by 
corporations, by governments. 

These masses of data, numerical facts, measurements, which are
generally known as statistics, may more precisely be called statistical
data. The special methods used in the explanation and the elucidation
of quantitative data may be fittingly called statistical methods.
The analysis which is peculiar to and forms the basis of our method
we call statistical analysis.

The word statistics is generally used indiscriminately in two dif- 
ferent senses: on the one hand to refer to statistical material, the 
group of numerical data; and on the other hand, to statistical 
analysis, which includes those technical operations that have to do 
with the explanation and the interpretation of the numerical data. 

As we shall use the term, statistics is the science which deals with
the collection, the organization, the analysis, and the interpretation of
masses of numerical facts. It will be noted that this definition is
broader in scope than that given by Yule and Kendall. They say:²

¹ This is not meant to imply that statistics is a new subject. See the Book of
Numbers in the Bible. See also H. M. Walker, Studies in the History of Statistical
Method, 1929.

² G. U. Yule and M. G. Kendall, Introduction to the Theory of Statistics,
12th ed., p. 3.





By statistics we mean quantitative data affected to a marked extent 
by a multiplicity of causes. 

By statistical methods we mean methods specially adapted to the elu- 
cidation of quantitative data affected by a multiplicity of causes. 

By theory of statistics we mean the exposition of statistical methods. 

Statistical methods are fundamentally the same whether employed 
in the analysis of physical phenomena, the study of educational 
measurements, the records of biological experiment, or the analysis 
of quantitative material in economics. All such data are “affected 
to a marked extent by a multiplicity of causes.” True, the physicist, 
the chemist, the biologist, and possibly the psychologist attempt to
eliminate many disturbing causes and to concentrate their atten- 
tion upon one or two most powerful influences affecting their phenom- 
ena, yet many disturbances are always present. However, the 
same general procedure is followed by the educationist and the 
economist. Generally, it is one of continued summarization. 

We shall therefore feel free to apply our analysis to numerical data 
whether they come from the astronomer or the agriculturist, the 
physicist or the economist, the biologist or the chemist. Wherever 
there is a mass of numerical data that admits of explanation, we shall 
consider its analysis our field of endeavor. 

The fact that the human mind is incapable of comprehending a 
large number of impressions at one time is generally recognized. A 
mass of numerical data is an appropriate illustration. To grasp the 
meaning of a mass of numerical data we must reduce its bulk. The 
organization of the data is the first step in the summarizing process. 
It is a phrase that is used to describe the process of arranging the 
data in a compact form that facilitates computations and comparison. 
When they are so arranged, ordered, classified, — organized, — they 
are then in a form suitable for the analysis. 

The process of abstracting the significant facts contained in a mass 
of numerical data and making clear and concise statements about the 
derived results constitutes a statistical analysis of the data. A 
statistical analysis, therefore, enables us to express the relevant 
information contained in the mass of data by means of a few numeri- 
cal values known as statistical constants or parameters, each constant 
describing an important property of the mass of data. It is thus the 
purpose of statistical analysis to give a summarized and comprehensible
numerical description of masses of numerical data. This is
effected by computing a few constants pertaining to the data and 
understanding their meaning. 

Toward this numerical description of the mass of data we may 
adopt two points of view. We may view the description of the given 
mass as an end in itself, or we may view it as a basis for generaliza- 
tion, as a basis for making estimates of the character measured 
pertaining to a larger group. The smaller group that is analyzed 
we call a sample, the larger group about which we make estimates 
we call the parent population or universe. The interpretation of the
data is a phrase used when we adopt the larger point of view and 
make estimates, form judgments, or draw inferences of the universe 
from a study of the statistical properties of the sample. 

Let us consider, for example, the scores of 100 freshmen at Buck- 
nell University on a standardized Algebra test on which the highest 
attainable score was 50. Here are the scores in Table 1. 

Table 1. Scores of 100 Freshmen on an Algebra Test

43  18  25  18  39  44  19  20  20  26
40  45  38  25  13  14  27  41  42  17
34  31  32  27  33  37  25  26  32  25
33  34  35  46  29  24  31  34  35  24
28  30  41  32  29  28  30  31  30  31
28  31  30  34  40  29  46  30  30  47
31  35  36  29  26  32  36  35  36  37
32  23  22  29  33  37  33  27  24  36
23  42  29  37  19  23  44  41  45  39
21  21  42  22  28  38  15  16  17  28


If we desire such information regarding the above scores as is 
found in the answers to the following questions, we must look through 
the entire table. 

1. How many students obtained scores greater than 43? 

2. How many obtained scores greater than 22 and less than 43? 

3. How many obtained scores less than 23? 

4. What is the lower boundary of the upper 20% of the scores? 

5. What is the lower boundary of the upper 40% of the scores? 

Such questions may be readily answered if we organize the data 
by arranging the scores into classes, as we have done in Table 2.





Table 2. Organization of the Data of Table 1

Class          Frequency, or the number of
               scores in the given class

42.5-47.5        8
37.5-42.5       12
32.5-37.5       20
27.5-32.5       28
22.5-27.5       16
17.5-22.5       10
12.5-17.5        6

Total          100

The new table is called a frequency table for it gives the frequency 
(the number of scores) in the respective intervals. Evidently the 
organization of the data presents them in a form that is more suitable 
for statistical purposes than the disorganized form of Table 1 does. 
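The passage from Table 1 to Table 2 can be sketched in a few lines of Python. This is only an illustration, not the procedure of the text: the list `scores` copies Table 1 in the order printed (two entries that are blurred in the printed copy are read here as 25 and 23, which is an assumption), and the names `lower_bounds`, `class_index`, and `frequency_table` are hypothetical.

```python
from collections import Counter

# Scores of Table 1 in printed order (blurred entries read as 25 and 23: an assumption).
scores = [
    43, 18, 25, 18, 39, 44, 19, 20, 20, 26,
    40, 45, 38, 25, 13, 14, 27, 41, 42, 17,
    34, 31, 32, 27, 33, 37, 25, 26, 32, 25,
    33, 34, 35, 46, 29, 24, 31, 34, 35, 24,
    28, 30, 41, 32, 29, 28, 30, 31, 30, 31,
    28, 31, 30, 34, 40, 29, 46, 30, 30, 47,
    31, 35, 36, 29, 26, 32, 36, 35, 36, 37,
    32, 23, 22, 29, 33, 37, 33, 27, 24, 36,
    23, 42, 29, 37, 19, 23, 44, 41, 45, 39,
    21, 21, 42, 22, 28, 38, 15, 16, 17, 28,
]

# Lower boundaries of the classes of Table 2: 12.5-17.5, 17.5-22.5, ..., 42.5-47.5.
width = 5
lower_bounds = [12.5 + width * k for k in range(7)]


def class_index(score):
    """Return the position (0 = lowest class) of the class containing the score."""
    return int((score - lower_bounds[0]) // width)


# Count how many scores fall in each class.
counts = Counter(class_index(s) for s in scores)
frequency_table = [(lo, lo + width, counts[k]) for k, lo in enumerate(lower_bounds)]

# Print the table with the highest class first, as in Table 2.
for lo, hi, f in reversed(frequency_table):
    print(f"{lo:4.1f}-{hi:4.1f}  {f:3d}")
print(f"Total      {sum(f for _, _, f in frequency_table):3d}")
```

Because every boundary ends in .5 and every score is a whole number, no score can fall on a boundary, so each score belongs to exactly one class.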

Exercise. Make a list of several facts that you can immediately 
discover from Table 2. Answer the questions that we have listed on the 
preceding page. 

It must not be supposed that the answers to the above questions 
constitute the analysis of the data. The analysis is contained in 
the following constants that we shall later learn to compute and 
interpret. 

M  = 30.7        σ  = 7.85

Md = 30.71       Q1 = 25.3

Mo = 30.5        Q3 = 36.25


each expressed in the given unit of measure. 

To undertake an interpretation at this time would take us too far 
afield. 
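As a preview, several of these constants can be recovered from Table 2 alone. The sketch below assumes two rules that this section has not yet stated: each score is treated as lying at the midpoint of its class when the mean is computed, and the median and quartiles are located by linear interpolation within a class. The function names `grouped_mean` and `grouped_quantile` are hypothetical.

```python
# Class boundaries and frequencies of Table 2, lowest class first.
classes = [
    (12.5, 17.5, 6), (17.5, 22.5, 10), (22.5, 27.5, 16), (27.5, 32.5, 28),
    (32.5, 37.5, 20), (37.5, 42.5, 12), (42.5, 47.5, 8),
]

n = sum(f for _, _, f in classes)  # total number of scores


def grouped_mean(classes):
    """Mean with every score placed at its class midpoint (an assumed rule)."""
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_quantile(classes, p):
    """Point below which a fraction p of the scores lie (linear interpolation)."""
    target = p * n
    below = 0  # scores in the classes already passed
    for lo, hi, f in classes:
        if below + f >= target:
            return lo + (target - below) / f * (hi - lo)
        below += f
    return classes[-1][1]


mean = grouped_mean(classes)
median = grouped_quantile(classes, 0.50)
q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
print(round(mean, 2), round(median, 2), round(q1, 2), round(q3, 2))
```

Under these assumptions the results agree with the values of M, Md, and the two quartiles listed above to the number of places shown there.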


2. MATHEMATICAL AND NON-MATHEMATICAL 
ASPECTS OF STATISTICS 

As has been indicated in our definition, the steps involved in the 
solution of a statistical problem may be summarized as follows: 





1. The collection of the data 

2. The organization of the data 

3. The analysis of the data 

4. The interpretation and criticism of the results 

The collection of the data and their organization are largely
non-mathematical operations. In regard to this Bowley says, “Common
sense is the chief requisite and experience the chief teacher.”¹
However, we shall refer to these items in later chapters. The ele- 
mentary analysis of the data involves in general no so-called higher 
mathematics. It is well to understand algebraic averages and some 
of the elementary principles of the algebra of summations. These 
will be considered in another section of this chapter. An under- 
standing of the calculus is always helpful and at certain times highly 
desirable, but this preparation is not necessary for the elementary 
course. The student who desires a knowledge of the more refined 
methods of statistical analysis will find an understanding of the 
calculus and the theory of probability indispensable.² For the
interpretation and the criticism of the results, one cannot know 
too much. Bowley says in this regard: 

For criticism of estimates and interpretation of results it is necessary 
to use formulae of more advanced mathematics, and it is obviously ex- 
pedient to understand the methods by which these formulae are obtained 
to ensure their intelligent use.³

Since this book is essentially one of the methods of elementary 
analysis of statistical data, all technical questions that require a 
considerable knowledge of advanced mathematics will be omitted. 

¹ A. L. Bowley, Elements of Statistics, p. 14.

² E. V. Huntington, “Mathematics and Statistics,” American Mathematical
Monthly, December, 1919.

³ Bowley, op. cit., p. 14.


3. VARIABLES AND FUNCTIONS

A common property of any character with which statistics is concerned
is that of variation or change. The grades of a class in
geometry, the scores in an examination, the heights of a group, the
number of petals on a group of buttercups, the production of wheat
from year to year, even a group of measurements of the length of a
room: all these show variation. In statistics the magnitudes of
the character measured are frequently called variates. A variate,
then, is a very special and specific use of the broader term variable,
which signifies any quantity that changes in magnitude.

Table 3. Grades of a Class in Algebra

Grade      Frequency, or the number of
X          students receiving the grade
           f(X)

65          3
75         14
85         10
95          3

(Total)    30


In Table 3 there are two variables — the grades, X, and the 
frequencies, f(X) — but the variates are the magnitudes of the 
grades. 

We shall find it necessary and convenient to recognize two distinct 
classes of variates, continuous, and discrete or discontinuous. 

A continuous variate is one whose magnitudes may differ by 
infinitesimal amounts between certain limits: for example, the 
weight of a man, the temperature of a place, the length of a bean pod, 
the height of a plant. 

A discrete or discontinuous variate is one whose value must be 
described in integers; for example, the number of pupils in a class, 
the number of kernels on an ear of corn, the number of seeds in an 
apple, the number of culms on an oat plant. A discrete variate is 
sometimes called an integral variate. 

As in other fields of mathematics, it will be convenient to recall 
another classification of variables, namely, the independent and the 
dependent. The independent variable is the one to which we assign 
values at pleasure, whereas the dependent variable is the one whose 
value depends upon that assigned to the independent variable. 

In Table 3 the frequency of a given grade depends upon the grade 
we have assigned at pleasure. The frequency is, therefore, the 
dependent variable. The dependent variable is frequently called a 
function of the independent variable or argument. The independent 
variables will be represented in this text by X, x', x, or t; the functional,
or dependent, variable by y, f(x), or f(t), etc. We shall say
that y or f(x) is a function of x if y is dependent upon x and if to every
value of x there corresponds a value of y or f(x). It will not be necessary
to describe this correspondence by means of an equation.
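A correspondence given by a table rather than by an equation can be illustrated with a short Python sketch. The dictionary `grade_frequency` holds the pairs of Table 3, and the function name `f` mirrors the notation f(X); both are illustrative choices, not part of the text.

```python
# Table 3 as a correspondence: each grade X is paired with its frequency f(X).
grade_frequency = {65: 3, 75: 14, 85: 10, 95: 3}


def f(x):
    """Frequency of the grade x; defined only for the grades listed in Table 3."""
    return grade_frequency[x]


# The dependent variable takes a value for every value of the independent variable.
for x in grade_frequency:
    print(x, f(x))

total = sum(f(x) for x in grade_frequency)
print("Total", total)
```

No formula connects 65 with 3 or 75 with 14; the table itself is the function.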


4. SUMS AND SUMMATIONS 

Statistics may be roughly defined as the study of averages. Since 
nearly all averages involve the evaluation of certain sums, it will 
be well at this point to acquaint ourselves with an abbreviated 
notation for sums and develop some useful formulas that will aid 
us in quickly evaluating later sums. We shall discover that a facility 
with this new notation will quicken our understanding of the later 
chapters. 

In elementary algebra if we add a set of letters $x_1, x_2, \ldots, x_n$ we
indicate the sum by:

$$x_1 + x_2 + x_3 + \cdots + x_n$$

In more advanced mathematics we would designate this sum compactly
as $\sum_{i=1}^{n} x_i$. Thus:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \cdots + x_n$$

This should be read “sigma of x sub i (or summation of x sub i)
when i assumes all integral values from 1 to n inclusive.”

The Greek capital letter Σ (sigma) placed before a term signifies
the sum of all terms of which that term is the general type.

Thus:

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \sum_{1}^{n} x^2$$


1 + 1 + 1 + 
1 + 2 + 3 + 



$$2^3 + 3^3 + 4^3 + 5^3 + 6^3 = \sum_{2}^{6} x^3$$

$$\log 5 + \log 7 + \log 9 + \cdots + \log (2n - 3) = \sum_{4}^{n} \log (2x - 3)$$

and in general

$$f(1) + f(2) + f(3) + \cdots + f(n) = \sum_{x=1}^{n} f(x) \qquad (1)$$
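As an informal aside, definition (1) can be mirrored in a few lines of Python. The helper name `sigma` is our own label and not notation from the text; it simply adds f(x) for every integer x from the lower limit to the upper limit inclusive. The choice n = 6 in the logarithm example is an arbitrary illustration.

```python
import math

def sigma(f, lower, upper):
    """Sum f(x) for x = lower, lower + 1, ..., upper (both limits included)."""
    return sum(f(x) for x in range(lower, upper + 1))

# 2^3 + 3^3 + 4^3 + 5^3 + 6^3
cubes = sigma(lambda x: x ** 3, 2, 6)

# log 5 + log 7 + ... + log(2n - 3), with the illustrative choice n = 6
logs = sigma(lambda x: math.log(2 * x - 3), 4, 6)

print(cubes)  # 440
print(logs)
```

Note that Python's `range` excludes its upper end, which is why the helper adds 1 to the upper limit.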





The values below and above the summation symbol Σ which are
the initial and final values of the independent variable are the
limits of the summation. The limits of the summation may have
any values and the independent variable may change by other
amounts than unity. Slight changes in the notation make these
extensions possible. Thus:

$$x_5 + x_{10} + x_{15} + \cdots + x_{100} = \sum_{i = 5, 10, \ldots}^{100} x_i = \sum_{i=1}^{20} x_{5i}$$

$$f(65) + f(75) + f(85) + f(95) = \sum_{65, 75, \ldots}^{95} f(x)$$

$$5f(5) + 10f(10) + 15f(15) + \cdots + 75f(75) = \sum_{1}^{15} 5x f(5x)$$

$$f(a) + f(a + b) + f(a + 2b) + \cdots + f(a + nb) = \sum_{x=0}^{n} f(a + xb)$$
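The first of these extensions can be checked on a small sketch. The values $x_i = i^2$ below are hypothetical, chosen only so that there is something to add; the point is that stepping the index by 5 and re-indexing with subscript 5i pick out the same terms.

```python
# Hypothetical values x_1, ..., x_100 (here x_i = i squared), stored by subscript.
x = {i: i * i for i in range(1, 101)}

# The index runs over 5, 10, ..., 100 directly.
stepped = sum(x[i] for i in range(5, 101, 5))

# The index runs over 1, 2, ..., 20 and the subscript is 5i.
reindexed = sum(x[5 * i] for i in range(1, 21))

print(stepped == reindexed)  # True
```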

If the quantity under the summation Σ does not contain a variable,
all the terms are equal. As examples we have:

$$\sum 1 = 1 + 1 + 1 + 1 + \cdots + 1 = N$$

$$\sum c = c + c + c + c + \cdots + c = Nc$$

where N is the number of observations or measurements.

It frequently happens that there is no necessity for writing the 
independent variable or the limits of the summation below and above 
the summation symbol. When the context tells us what is meant, 
we shall resort to this plan. Thus, if the lower or upper limit is 
omitted, it is assumed to be 1 or n respectively. 

EXERCISES 


Write the series that are represented by the following symbols: 



n i 

n 

n 


10 

1. 


2 . 22* 

3 . 2(x - 3) 

4 . 

2i 0 C* 


IX 2 

I 

l 


x= 1 


n 

1° 

100 


80 

6 . 

2xa x 

6 . 2Xio(7a? 

7 . 2 xf(x) 

8 . 

2^/(x) 


r» 1 

1 

10, 20, . . . 

1 

1, 10. . . . 

Write in the abbreviated form using Σ.

9. $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n + 1)$

10. $(X_1 - M)^2 + (X_2 - M)^2 + (X_3 - M)^2 + \cdots + (X_n - M)^2$


We will now consider two useful and important theorems. 





Theorem I. The Σ (sigma) of an algebraic sum of several functions
is equal to the algebraic sum of the sigmas of the several functions.
Symbolically, we state that:

$$\sum [f(x) \pm F(x) \pm w(x) \pm \text{etc.}] = \sum f(x) \pm \sum F(x) \pm \sum w(x) \pm \text{etc.}$$

Theorem II. The Σ (sigma) of a constant times a function is equal
to the constant times the sigma of the function. Symbolically, we state
that:

$$\sum c f(x) = c \sum f(x)$$

This means of course that the constant factor, c, may be placed to the
left or to the right of the Σ at pleasure.

Proof of Theorem I (for two functions).

By the definition (1):

$$\sum [f(x) \pm F(x)]$$

$$= f(1) \pm F(1) + f(2) \pm F(2) + \cdots + f(n) \pm F(n)$$

$$= [f(1) + f(2) + \cdots + f(n)] \pm [F(1) + F(2) + \cdots + F(n)]$$

$$= \sum f(x) \pm \sum F(x)$$

The proof is easily extended to any number of functions. 

The proof of Theorem II will be left as an exercise for the stu- 
dent. 
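A numerical spot check is no substitute for the proofs, but it may help the reader see what the two theorems assert. In the sketch below the functions f and F, the constant c, and the upper limit n are arbitrary illustrative choices of ours, and `sigma` is a hypothetical helper that sums from 1 to n as in definition (1).

```python
def sigma(f, n):
    """Sum f(x) for x = 1, 2, ..., n, as in definition (1)."""
    return sum(f(x) for x in range(1, n + 1))

def f(x):
    return x ** 2        # illustrative choice of f

def F(x):
    return 3 * x + 1     # illustrative choice of F

c, n = 7, 12

# Theorem I: the sigma of a sum (or difference) is the sum (or difference) of the sigmas.
theorem_1 = (
    sigma(lambda x: f(x) + F(x), n) == sigma(f, n) + sigma(F, n)
    and sigma(lambda x: f(x) - F(x), n) == sigma(f, n) - sigma(F, n)
)

# Theorem II: a constant factor may be moved outside the sigma.
theorem_2 = sigma(lambda x: c * f(x), n) == c * sigma(f, n)

print(theorem_1, theorem_2)  # True True
```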

Example. From the identity

$$x^2 - (x - 1)^2 = 2x - 1$$

we shall prove:

$$\sum_{1}^{n} x = \frac{n(n + 1)}{2}$$

From the preceding definitions and theorems we have

$$\sum [x^2 - (x - 1)^2] = \sum 2x - \sum 1 = 2 \sum x - n$$

and this means, by (1):

$$(1^2 - 0^2) + (2^2 - 1^2) + \cdots + [n^2 - (n - 1)^2] = 2 \sum x - n$$

or

$$n^2 = 2 \sum x - n$$

Hence, we obtain:

$$\sum x = \frac{n(n + 1)}{2}$$
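The three steps of this example can be checked for particular values of n. The following is only a sketch: the function name `telescoped` and the handful of test values are our own choices, and the checks confirm the argument for those values without replacing the proof.

```python
def telescoped(n):
    """Sum of x^2 - (x - 1)^2 for x = 1, ..., n; all inner terms cancel."""
    return sum(x ** 2 - (x - 1) ** 2 for x in range(1, n + 1))

for n in (1, 2, 10, 25, 100):
    sum_x = sum(range(1, n + 1))
    assert telescoped(n) == n ** 2        # the left side collapses to n^2
    assert n ** 2 == 2 * sum_x - n        # the identity summed term by term
    assert sum_x == n * (n + 1) // 2      # the closed form for the sum of x

print("all checks passed")
```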





EXERCISES

1. Write out in full what is meant by $\sum x = \frac{n(n + 1)}{2}$.

2. Using the identity

$$x^3 - (x - 1)^3 = 3x^2 - 3x + 1$$

prove that:

$$\sum_{1}^{n} x^2 = \frac{n(n + 1)(2n + 1)}{6}$$

State in words.

3. Prove:

$$\sum_{1}^{n} (2x - 1) = n^2$$

4. Apply the result of Number 3 to find the sums:

(1) $11 + 13 + 15 + \cdots + 87 = \sum_{1}^{44} (2x - 1) - \sum_{1}^{5} (2x - 1)$

(2) $127 + 129 + 131 + \cdots + 195$


The above definitions, theorems, and exercises are concerned with 
the abstract algebra of summation. The numbers that have appeared 
as illustrations enjoy a degree of regularity not found in observed 
measurements. Thus, such series as: 1, 2, 3, . . ., 25; $1^2, 2^2, 3^2, \ldots, 16^2$
are not often found in actual measurements. So this abstract
algebra is apparently not valuable in dealing with observed data. 
Actually, we shall find this abstract algebra of summation very help- 
ful in developing statistical theory and frequently helpful in dealing 
with real measurements. 

The numbers we meet in numerical problems are real measure- 
ments or scores that come from actual observation and they do not 
generally proceed with regularity. We should understand thoroughly 
how the algebra of summation applies to such measurements. Ten 
men in a class in Statistical Analysis gave their weights: 128, 131, 
137, 143, 144, 146, 147, 149, 155, 170 pounds. Obviously these 
numbers do not proceed from the small to large values with the 
regularity of the numbers: $1^3, 2^3, 3^3, \ldots, 12^3$. However, we can
apply our Σ notation to such irregular series as the weight data.

Let us arrange our data as in the adjacent vertical column, where
we use the upper case (capital) X to indicate a measurement of
weight. The first measurement we indicate by $X_1$, the second by $X_2$,
and so on. We are then able to express the sum of the weights by the
summation sign, Σ.

Weight
X_1  = 128
X_2  = 131
X_3  = 137
X_4  = 143
X_5  = 144
X_6  = 146
X_7  = 147
X_8  = 149
X_9  = 155
X_10 = 170

$$\sum_{i=1}^{10} X_i = 1450$$

Average weight = 1450/10 = 145 lbs.

Out of curiosity we found the average weight of the ten men.

In this case we note that it is the subscript i that varies, and with
the weight of each man is associated a subscript. It is generally not
necessary to indicate so precisely the indexes of the summation. It
suffices to know that X refers to the characteristic measured, weight
of a man, and $\sum X$ represents the sum of the weights of the men.
Consequently we could write

Sum of the weights = $\sum X$ = 1450 pounds

and not be misunderstood.

To save labor in statistical computations we find it convenient to
effect simple transformations upon the variates. Thus, for the weight
data we can work with smaller numbers if we refer our weights to some
conveniently chosen number, say 100, instead of to 0. Since X
represents the measurements referred to zero as origin, we must choose
a new letter, say U, to represent the measurements referred to 100 as
origin. We pose the question: Can we find $\sum X$ by finding and using
$\sum U$?

The first weight whose X is 128 will have a U of 28; the second
weight has X = 131 and U = 31; and so on. Obviously we have

U = X - 100

for each of the measurements. The detail is shown in Table 4.

Table 4. Weights of 10 Men

Weight X        U = X - 100
X_1  = 128      U_1  = 28 = X_1  - 100
X_2  = 131      U_2  = 31 = X_2  - 100
X_3  = 137      U_3  = 37 = X_3  - 100
X_4  = 143      U_4  = 43 = X_4  - 100
X_5  = 144      U_5  = 44 = X_5  - 100
X_6  = 146      U_6  = 46 = X_6  - 100
X_7  = 147      U_7  = 47 = X_7  - 100
X_8  = 149      U_8  = 49 = X_8  - 100
X_9  = 155      U_9  = 55 = X_9  - 100
X_10 = 170      U_10 = 70 = X_10 - 100

Adding, $\sum U = 450 = \sum X - 10(100)$

$\sum X = 10(100) + 450$

Average weight = 100 + 45 = 145 lbs.

In practice we do not go into such detail. We abbreviate our work 
a great deal as is indicated in Table 5. We follow a few systematic 
steps: 

(1) We decide upon the transformation we wish to use. For the 
weight data we use U = X — 100. 

(2) We complete the table to agree with the chosen transformation. 
That is, we find the U that corresponds to a given X. 

(3) We derive a formula to agree with the chosen transformation. 
Thus, when U = X — 100, we have 

X = U + 100

$$\sum X = \sum (U + 100) = \sum U + \sum 100$$

$$\sum X = \sum U + 10(100)$$

since N , the number of measurements, is 10. 

(4) We substitute the values found from the table, step (2), into 
the formula we derive in step (3). 
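The four steps above can be sketched in Python with the ten weights of the class. The variable names (`weights`, `origin`, `u`) are our own labels; the arithmetic is exactly the transformation U = X - 100 followed by the formula of step (3).

```python
weights = [128, 131, 137, 143, 144, 146, 147, 149, 155, 170]  # the X values
origin = 100                                # step (1): use U = X - 100

u = [x - origin for x in weights]           # step (2): the U for each X
n = len(weights)                            # N = 10

sum_u = sum(u)
sum_x = sum_u + n * origin                  # steps (3)-(4): sum X = sum U + N(100)
mean_x = origin + sum_u / n                 # average referred back to zero

print(sum_u, sum_x, mean_x)  # 450 1450 145.0
```

The smaller numbers in `u` are what make the hand computation lighter; on a machine the gain is negligible, but the identity is the same.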





We now submit two more illustrations that involve simple
transformations. In each case our problem is to find $\sum X$ by using
$\sum U$.

$$\sum X = \sum 125U = 125 \sum U$$

$$\sum X = 125(15) = 1875$$

$$\sum X = \sum (128U + 384) = \sum 128U + \sum 384$$

$$\sum X = 128 \sum U + N(384)$$

$$\sum X = 128(3) + 6(384) = 2688$$


EXERCISES 

1. Complete the following tables and find the values of the quantities 
suggested. 



$\sum x$ = ( )            $\sum U$ = ( )     $\sum U^2$ = ( )

$(\sum x)^2$ = ( )        Show that $\sum X = 10(50) + \sum U$

$\sum x^2$ = ( )          $\sum X$ = ( )

$\sqrt{\sum x^2}$ = ( )   Can you find $\sum X^2$ by using $\sum U$ and $\sum U^2$?

                          $\sum X^2$ = ( )





X         Y         X^2       XY        Y^2
2         11
4         8
6         7
8         5
10        4
Total     30        35
Average   6         7

X         Y         x = X - 6     y = Y - 7     x^2       xy
2         11
4         8
6         7
8         5
10        4

$\sum X^2$ = ( )
$\sum Y^2$ = ( )
$(\sum X)(\sum Y)$ = ( )
$\sum XY$ = ( )

$\sum x$ = ( )
$\sum x^2$ = ( )

$\sum y$ = ( )
$\sum xy$ = ( )

$\frac{\sum xy}{\sum x^2}$ = ( )


5. REMARKS ON MEASUREMENT 

It is almost a commonplace that nearly all the numerical data 
used by the statistician are necessarily approximations, true usually 
to two, three, or more figures. The observer who collects the original 
data and the statistician who undertakes to analyze and interpret 
them are frequently different individuals. The statistician must 
accept the measurements that are given him and should seek to 
obtain results that are consistent with the data. 

The degree of approximation of a measurement depends upon the 
skill and the carefulness of the operator and upon the kind of instru- 
ment used. Of course it may happen that a measurement is exactly 
the true value of the quantity measured, but the measurer can never 
know when this is so. Since statistics has to do primarily with 
observed measurements, which are admittedly approximations, and 
with processes that are also approximative, it is obvious that any 
numerical result computed from them will in like manner be an 
approximation. 

6. DECIMAL ACCURACY 

The average student may have some difficulty in grasping the idea 
that accuracy is a relative matter and absolute precision of measure- 
ment an impossibility. He is accustomed to think of 9.7 as meaning 
the same thing as 9.70 and even 9.70000000 ... to an unlimited 
number of decimal places. 





If 9.7 does not mean the ideal number 9.7000000 . . ., what does
it mean? For the sake of clarity of understanding and precision of 
statement, the scientist has adopted the convention that “9.7” 
means between 9.65 and 9.75. If we record the length of a line as 
18 inches we mean that it lies between 17.5 inches and 18.5 inches. 
When we say that the distance to the moon is 240,000 miles, we 
mean that that distance is between 235,000 and 245,000 miles. 

A measurement recorded as 18 inches means that the measurement
is correct to the nearest inch or to units. A measurement¹ recorded
as 9.7 cm. means that the number is correct to the nearest tenth of a
centimeter. The number is sometimes written 9.7 ± 0.05 in which
the expression ± 0.05 should be read “with a possible error of 0.05.”

Similarly, a recorded value of 9.70 would mean from 9.695 to 9.705
and might be written 9.70 ± 0.005.

Unless otherwise specified, a score for a continuous variate should 
be interpreted as extending from half a unit of the last place of the 
measurement below to half a unit above the recorded entry. A 
similar assumption regarding discrete data avoids confusion in the 
analysis. Hence we shall assume that a measurement for a discrete 
variate extends from half a unit below to half a unit above the 
recorded score. 
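The convention can be made concrete with a short sketch. It assumes the measurement is supplied as text exactly as recorded, so that "9.7" and "9.70" remain distinguishable, and it uses Python's `decimal` module to keep the arithmetic exact. The function name `implied_interval` is hypothetical. A rounded whole number such as 240,000 miles, whose last significant place is not its last written digit, is outside the scope of this sketch.

```python
from decimal import Decimal

def implied_interval(recorded: str):
    """Return (low, high) implied by a recorded measurement such as '9.70'."""
    value = Decimal(recorded)
    # Unit of the last recorded place: '9.70' -> 0.01, '9.7' -> 0.1, '18' -> 1.
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    half = unit / 2
    return value - half, value + half

for text in ("9.7", "9.70", "18"):
    low, high = implied_interval(text)
    print(text, "means between", low, "and", high)
```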


7. SIGNIFICANT FIGURES 

In the expression 9.7 cm., both the 9 and the 7 mean something or 
are significant. In the expression 97 mm., there are likewise two 
significant figures. There are five significant figures in each of the 
numbers 203.05, 263.10, 0.0076389, 500.00, but only two in the 
number 93,000,000 which gives the approximate number of miles 
from the earth to the sun. 

When the distance from the earth to the sun is given as 93,000,000 
miles, in the light of the convention that we discussed in the pre- 
ceding section, the statement might be interpreted to mean that the 
distance is between 92,999,999.5 and 93,000,000.5 miles. Since the 
figures 9 and 3 alone are to be regarded as significant, the exact dis- 
tance is between 92,500,000 and 93,500,000 miles. This confusion can 
be prevented by writing the number in the standard form $9.3 \times 10^7$,
the number of significant figures being indicated by the factor at
the left which has one figure before the decimal point.

¹ If a measurement may be written a ± e, we call e the possible error in a,
the measurement.

We determine the significant digits in a number by reading the 
number from left to right, commencing with the first digit not zero 
and ending with the last digit accurately specified. The position of 
the decimal point has no influence on the number of significant digits. 

Thus 34 has two significant figures; 7.3, two; 406, three; 7,003,
four; 8.0, two; 0.40, two; 9.00, three; 0.006, one; 0.0050, two;
and $2.4 \times 10^6$, two.
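The reading rule can be sketched as a small function that works on the number as written. Two assumptions of ours should be stressed: the number is passed as text (with the standard form written like "2.4e6"), and trailing zeros of a whole number written without a decimal point, as in 93,000,000, are treated as place holders rather than significant figures. The name `significant_figures` is hypothetical.

```python
def significant_figures(text: str) -> int:
    """Count significant figures in a number written as text."""
    # Keep only the factor before any power of ten, e.g. '2.4e6' -> '2.4'.
    mantissa = text.lower().replace(",", "").split("e")[0].lstrip("+-")
    # Start at the first digit that is not zero.
    digits = mantissa.replace(".", "").lstrip("0")
    if "." not in mantissa:
        # Assumption: trailing zeros of a whole number only fix the decimal point.
        digits = digits.rstrip("0")
    return len(digits)

for text in ("34", "406", "7,003", "8.0", "0.40", "0.006", "0.0050", "2.4e6"):
    print(text, significant_figures(text))
```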

8. ROUNDING OFF NUMBERS 

Sometimes we are furnished with numbers recording measurements 
that are given with a greater accuracy than we can use, or care to use. 
We accordingly round them off to the accuracy desired. 

A number is rounded off by dropping one or more digits at the
right. When the digit dropped is 5 or more, increase the preceding 
digit by unity; when it is less than 5, retain the preceding digit 
unchanged. 

The following numbers are rounded according to the above rule: 

Numbers Rounded Values 

4.5647 4.565; 4.57; etc. 

0.49781 0.498; 0.50; etc. 

17.65 17.7 

17.75 17.8 
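A sketch of this rule in Python follows. It uses the `decimal` module with `ROUND_HALF_UP`, because the built-in `round` works on binary floating-point values and sends exact halves to the nearest even digit, which is a different convention from the one stated above. The function name `round_off` is our own.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_off(text: str, places: int) -> Decimal:
    """Round a recorded number to `places` decimals, a dropped 5 or more rounding up."""
    quantum = Decimal(1).scaleb(-places)    # places = 1 -> 0.1, places = 3 -> 0.001
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_off("4.5647", 3))    # 4.565
print(round_off("0.49781", 3))   # 0.498
print(round_off("0.49781", 2))   # 0.50
print(round_off("17.65", 1))     # 17.7
print(round_off("17.75", 1))     # 17.8
```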

9. ERRORS IN CALCULATIONS 

As magnitudes determined by measurement are not exact, it is 
important to make clear the meaning of the term error as it is used in 
statistics. 

In the first place, errors are not necessarily what we usually think 
of as mistakes or blunders. The latter arise from carelessness or 
incompetency in transcribing figures or reading values from a scale. 
An absolute error in observation is the difference between a given 
measurement and the true value of the quantity measured. There- 
fore, an error means a deviation, a difference, but not a mistake. 

The relative error in a measurement is the ratio of the absolute error 
to the true value of the quantity. It may be closely approximated 
by finding the ratio of the possible error to the given measurement.
It is usually expressed as a percentage. Thus if a measurement of
height is given as 68.5 inches, there is a possible error of 0.05 inches 
and an approximate relative error of 0.05/68.5 = 0.0007, which equals 
0.07 per cent. If a physician reports the weight of a man as 163 
pounds with a possible error of 0.5 pound, the approximate relative 
error may be written as 0.5/163 = 0.003 = 0.3 per cent. 

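The relative errors in the two distances $9.3 \times 10^7$ and $9.30 \times 10^7$
are approximately:

$$\frac{0.05 \times 10^7}{9.3 \times 10^7} = 0.005 = 0.5\% \quad \text{and} \quad \frac{0.005 \times 10^7}{9.30 \times 10^7} = 0.0005 = 0.05\%$$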

This illustrates the fact that the relative error depends upon the 
number of significant figures in and not upon the position of the deci- 
mal point in a recorded measurement. 
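The approximate relative error is a single division, and the illustrations of this section can be reproduced with the sketch below. The function name `relative_error` is hypothetical; the possible errors are half a unit of the last recorded place, as in the preceding sections.

```python
def relative_error(measurement: float, possible_error: float) -> float:
    """Approximate relative error: possible error divided by the measurement."""
    return possible_error / abs(measurement)

print(f"{relative_error(68.5, 0.05):.4f}")        # 0.0007  (height in inches)
print(f"{relative_error(163, 0.5):.3f}")          # 0.003   (weight in pounds)
print(f"{relative_error(9.3e7, 0.05e7):.3f}")     # 0.005   (two significant figures)
print(f"{relative_error(9.30e7, 0.005e7):.4f}")   # 0.0005  (three significant figures)
```

Multiplying either distance by a power of ten leaves the ratio unchanged, which is the sense in which the position of the decimal point does not matter.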

EXERCISES 

1. How many significant figures are in the following numbers? 

(1) 2.375 (2) 0.0347 (3) 0.0030 (4) 5.63(10*) (5) 5.6300(10*) 

2. What is the rule to be observed when rounding off numbers? 

3 . A line is measured and its length is recorded as 118.63 feet. What 
does this statement mean? What is the approximate relative error in 
the measurement? 

4. A line is measured and its length is recorded as 125.65 feet. What 
does this statement mean? What is the approximate relative error in the 
measurement? 

5. The population of a city is given as 2.5 million. What is the
approximate percentage error? 

6. The population of a city is given as 340 thousand. What is the 
approximate percentage error? 

7. The value of π correct to five significant figures is 3.1416.
Determine the percentage error when π is approximated by 3*.

8. The values of all mineral production in continental United States 
in 1929, correct to the nearest million dollars, was $5,165,000,000. Write 
this value in the standard form. Find the approximate percentage error 
in the given estimated value. 

9. Prove:

$$\sum_{1}^{n} 2x = n(n + 1)$$

10. Use the result of Number 9 to find the sums: 

(1) 30 + 32 + 34 + • • • + 96. 

(2) 128 + 130 + 132 + • • • + 164. 





11 . Find the sum of the following numbers correct to two decimal 
places: 2.4286, 12.673, 127.87, 35.583: 

(1) By retaining the significant figures of the numbers and rounding off 
the sum to two places; 

(2) By rounding off each number to two places and finding the sum. 
This exercise illustrates the rule: “When several approximate numbers 

are to be added, it is best to round them at once to the number of decimal 
places in the least accurate measurement.” 

12 . Find the sum of the following numbers correct to two decimal places: 
3.4285, 16.743, 253.78, 36.583: 

(1) By retaining the significant figures of the numbers and rounding off 
the sum to two places; 

(2) By rounding off to two places each number and finding the sum. 

10. THE PROPAGATION OF ERRORS 

In general, statistical computations are more concerned with
relative than with absolute errors. We shall include here the more
important theorems that relate to relative errors and expect the
reader who desires a wider knowledge to consult the splendid work
by Scarborough.¹

Theorem I. The possible error in the sum or the difference of two 
measurements is equal to the sum of the possible errors in the indi- 
vidual measurements. 

Suppose a and b are the readings of the two measurements and that
$e_1$ and $e_2$ are the numerical values of their errors. The true values are
therefore $a + e_1$ and $b + e_2$, where $e_1$ and $e_2$ may be either positive or
negative. The correct value of the sum of the measurements lies
between the limits:

$$(a + e_1) + (b + e_2) = (a + b) + (e_1 + e_2)$$

and

$$(a - e_1) + (b - e_2) = (a + b) - (e_1 + e_2)$$

Hence the possible error in the sum, a + b, is $e_1 + e_2$.

The correct value of the difference of the measurements lies
between the limits:

$$(a + e_1) - (b - e_2) = (a - b) + (e_1 + e_2)$$

and

$$(a - e_1) - (b + e_2) = (a - b) - (e_1 + e_2)$$

Hence the possible error in the difference, a - b, is $e_1 + e_2$.

¹ J. B. Scarborough, Numerical Mathematical Analysis, p. 2.

Example. The sides of a rectangular field are measured to be 127' ±
0.2' and 231' ± 0.4'. Find the possible error in the sum of the two sides.

We have:

$$a = 127, \quad b = 231, \quad e_1 = 0.2, \quad e_2 = 0.4$$

$$a + b = 358, \quad e_1 + e_2 = 0.6$$

Hence the possible error is 0.6' and the true value of the sum of the two
sides is between 358 - 0.6 and 358 + 0.6 feet.
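Theorem I can be sketched in code and compared against the four extreme cases directly. The function names below are hypothetical labels of ours; the loop over signs simply tries every combination of the two errors at their limits.

```python
from itertools import product

def add_with_error(a, e1, b, e2):
    """Sum of two measurements and its possible error (Theorem I)."""
    return a + b, e1 + e2

def subtract_with_error(a, e1, b, e2):
    """Difference of two measurements and its possible error (Theorem I)."""
    return a - b, e1 + e2

total, err = add_with_error(127, 0.2, 231, 0.4)
print(total, round(err, 10))  # 358 0.6

# Largest deviation of the sum over every combination of signs of the errors.
worst = max(
    abs((127 + s1 * 0.2) + (231 + s2 * 0.4) - total)
    for s1, s2 in product((-1, 1), repeat=2)
)
print(round(worst, 10))  # 0.6
```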

Theorem II. The relative error in the product of two measurements
is equal to the sum of the approximate relative errors of the individual
measurements.

With the same notation as above, the product will lie between:

$$(a + e_1)(b + e_2) = ab + ae_2 + be_1 + e_1 e_2$$

and

$$(a - e_1)(b - e_2) = ab - ae_2 - be_1 + e_1 e_2$$

Since $e_1$ and $e_2$ are both small when compared to the other terms
of the products, we shall ignore the term $e_1 e_2$. We then have the
possible error in the product to be approximately $ae_2 + be_1$.

Hence the relative error in the product is approximately

$$\frac{ae_2 + be_1}{ab} = \frac{e_1}{a} + \frac{e_2}{b}$$

which is the sum of the approximate relative errors.

Example. Find the absolute and the relative errors in the computed
area of the rectangle whose sides are 127' ± 0.2' and 231' ± 0.4'.

The possible error in the product is approximately

127(0.4) + 231(0.2) = 97 square feet

and the true value of the area is somewhere between

(127)(231) - 97 and (127)(231) + 97,

that is, between 29,240 and 29,434 square feet.

The relative error in the area is approximately:

$$\frac{0.2}{127} + \frac{0.4}{231} = 0.0033 = 0.33\%$$
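The computation for a product can be sketched as follows. The function name `product_with_error` is hypothetical, and the sketch follows Theorem II in ignoring the small term $e_1 e_2$; the last line shows how little that neglected term contributes for this rectangle.

```python
def product_with_error(a, e1, b, e2):
    """Product, approximate possible error, and approximate relative error."""
    value = a * b
    possible = a * e2 + b * e1      # the term e1 * e2 is ignored
    relative = e1 / a + e2 / b      # sum of the two relative errors
    return value, possible, relative

value, possible, relative = product_with_error(127, 0.2, 231, 0.4)
print(value, round(possible, 1), round(relative, 4))
print(round(value - possible), round(value + possible))

# Exact upper limit of the product, for comparison with value + possible.
exact_high = (127 + 0.2) * (231 + 0.4)
print(round(exact_high - (value + possible), 2))  # the neglected e1 * e2
```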





Theorem III. The relative error in the quotient of two measurements
is equal to the sum of the approximate relative errors of the
measurements.

The quotient will evidently lie between:

$$\frac{a + e_1}{b - e_2} = \frac{a}{b} + \frac{ae_2 + be_1}{b(b - e_2)}$$

and

$$\frac{a - e_1}{b + e_2} = \frac{a}{b} - \frac{ae_2 + be_1}{b(b + e_2)}$$

Since $e_2$ is small compared with b, we may, for purposes of
approximation, replace $b + e_2$ and $b - e_2$ by b; whence the possible
error in the quotient is approximately:

$$\frac{ae_2 + be_1}{b^2}$$

Hence the relative error in the quotient is given approximately by

$$\frac{ae_2 + be_1}{b^2} \div \frac{a}{b} = \frac{e_1}{a} + \frac{e_2}{b}$$

which is the sum of the approximate relative errors.

Example. Find the possible and the relative errors when 625 ± 0.7
is divided by 36 ± 0.2.

We have:

$$a = 625, \quad e_1 = 0.7$$

$$b = 36, \quad e_2 = 0.2$$

The possible error in the quotient is given approximately by

$$\frac{625(0.2) + 36(0.7)}{36^2} = 0.12$$

and the true value of the quotient will therefore lie between

$$\frac{625}{36} - 0.12 \quad \text{and} \quad \frac{625}{36} + 0.12$$

that is, between 17.24 and 17.48.

The relative error in the quotient is given approximately by:

$$\frac{0.7}{625} + \frac{0.2}{36} = 0.0067 = 0.67\%$$
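The quotient computation of Theorem III can be sketched in the same way. The function name `quotient_with_error` is our own; the exact limits at the end are obtained by taking the largest numerator with the smallest denominator and the reverse, so that the approximation can be compared with them.

```python
def quotient_with_error(a, e1, b, e2):
    """Quotient, approximate possible error, and approximate relative error."""
    value = a / b
    possible = (a * e2 + b * e1) / b ** 2   # b + e2 and b - e2 replaced by b
    relative = e1 / a + e2 / b              # sum of the two relative errors
    return value, possible, relative

value, possible, relative = quotient_with_error(625, 0.7, 36, 0.2)
print(round(value, 2), round(possible, 2), round(relative, 4))

# Exact limits of the quotient, for comparison with value -/+ possible.
low = (625 - 0.7) / (36 + 0.2)
high = (625 + 0.7) / (36 - 0.2)
print(round(low, 3), round(high, 3))
```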


EXERCISES 

Make each of the following computations and state the result so as to 
show a measure of the error involved. 

1. (125 ± 0.2) + (238 ± 0.3).

2. (215 ± 0.2)(115 ± 0.3).

3. (163 ± 0.2)/(25 ± 0.4).

4. What is the possible error in the area of a rectangle whose length 
and width are recorded as 50.4 ft., and 30.6 ft.? 

5. a. Show that if e is the error in the side of a square whose recorded 
length is a, then the error in the area is approximately 2 ae. 

b. Show that the relative error in the area is approximately twice the 
relative error in the edge. 

6. Show that the relative error in the area of a circle is approximately 
twice the relative error of the radius. 

7. The distance from the earth to the sun is given as 93,000,000 ±
500,000 miles, and the thickness of a watch spring is given as 0.014 ± 
0.0005 inches. Which is the more accurate measurement? 

8. Show that the relative error in the volume of a cube is approxi- 
mately three times the relative error of the edge. 

9. Show that the relative error in the volume of a sphere is approxi- 
mately three times the relative error of the radius. 

10. Are statistical data always approximate? If each of 10 men pays 
an income tax of $87, is their total contribution $870 approximate? 

11. Find the value of $\sum (12x^2 - 4x + 3)$.

12. Find the sum of $3 \cdot 7 + 4 \cdot 9 + 5 \cdot 11 + \cdots$ to n terms.

13. Find $11^2 + 12^2 + 13^2 + \cdots + 50^2$.

14. Prove:

$$\sum_{1}^{n} 2x(3x + 1) = 2n(n + 1)^2$$

15. Use the result of Number 14 to find the sums:

(1) $12 \cdot 19 + 14 \cdot 22 + 16 \cdot 25 + \cdots + 32 \cdot 49$.

(2) $36 \cdot 55 + 38 \cdot 58 + 40 \cdot 61 + \cdots + 80 \cdot 121$.

16. Prove that

$$\sum_{1}^{n} 2x(2x + 1) = \frac{n(n + 1)(4n + 5)}{3}$$

17. Use the result of Number 16 to find the sums:

(1) $12 \cdot 13 + 14 \cdot 15 + 16 \cdot 17 + \cdots + 96 \cdot 97$.

(2) $48 \cdot 49 + 50 \cdot 51 + 52 \cdot 53 + \cdots + 90 \cdot 91$.

18. Find in terms of n the value of $\sum x(x + 1)$.

19. Find in terms of n the value of

$1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5 + \cdots$ to n terms.





20. Find identities and prove that:

a. $\sum_{1}^{n} x^3 = \left[ \frac{n(n + 1)}{2} \right]^2$

b. $\sum_{1}^{n} x^4 = \frac{n(n + 1)(2n + 1)(3n^2 + 3n - 1)}{30}$

21. Find $\sum_{50}^{100} x^3$.

22. Find $1 \cdot 2^2 + 2 \cdot 3^2 + 3 \cdot 4^2 + \cdots$ to n terms.

23. The estimated value of anthracite coal produced in Pennsylvania
in 1929 was $3.935(10^8) and the estimated quantity produced was 7.664(10^7)
tons. What was the estimated value per ton and what was the relative
error in the estimate?

24. The estimated production of tobacco in the United States in 1929
was 1.5(10^9) pounds and the estimated price received was 10.0 cents per
pound. What was the estimated value of the crop? What was the
percentage error in the estimated value?

25. The estimated production of potatoes in the United States in 1929
was 3.57(10^8) bushels, and the estimated price per bushel was 131.4 cents.
What was the estimated value of the crop? What is the relative error in
the estimated value?

26. The estimated production of potatoes in the United States in 1929
was 3.57(10^8) bushels and the estimated acreage was 3.37(10^6). What was
the estimated yield per acre? What is the relative error in the estimated
yield?

27 . A teacher’s salary of $350 a month was decreased 10 per cent and 
later increased 5 per cent. What is his present salary? 

28 . City A increased in population from 25,750 to 35,890 in a decade, 
and City B increased from 255,000 to 350,000 during the same decade. 
Which city had the greater percentage increase? 

29. The general price level rose 80 per cent, then declined 33 1/3 per cent.
How much was it then above its starting point? 

30 . A man whose salary was actually $275.00 a month was reported 
to be receiving $300.00 a month. What was the percentage error in the 
report? 

31 . A teacher’s salary of $200 a month was decreased 25 per cent and 
then increased 25 per cent. What is his present salary? 

32 . The value of the exports from the United States to a neighboring 
country in 1934 was 15 per cent less than the value in 1933, but in 1935 
was 10 per cent greater than the value in 1934. Compare the value in 
1935 with that in 1933. 

33. The number of registered passenger automobiles in the United 
States in 1929 was 2.3122(10^7) and the estimated population the same
year was 1.22(10^8). What was the estimated number of people for each
passenger automobile? What was the relative error in the estimate? 



Chapter 2 

TABULAR AND GRAPHICAL REPRESENTATION: 

FREQUENCY DISTRIBUTIONS 

11. INTRODUCTION 

Almost without exception the object of a statistical analysis is to 
form a judgment of a very large universe by means of a study of a 
small part of it. The large universe we call the parent population,¹ and
the part of it that we use as a basis for generalization we call a
sample.

In some cases it is impossible to measure the entire parent popula- 
tion and in other cases it is impracticable to do so. Suppose a 
physician was interested in the blood pressure of American men 
between thirty and thirty-one years of age. He could never expect 
to get complete data for all the men in the parent population. Not 
only would it be impossible; it would be unnecessary, expensive, 
and a waste of time and energy. An excellent judgment could be 
made by the study of a properly selected sample.²

Our first task therefore is to secure the data of a properly selected 
sample, and then proceed to the analysis. The analysis will give us a 
summarized numerical description of the sample from which we may, 
if we desire, form certain judgments of the parent population. 

12. CLASSIFICATION OF THE DATA 

When a mass of data has been assembled it is necessary to classify 
the material in some compact and orderly form before it can be 
effectively analyzed. This procedure is known by statisticians as 
tabulation. It is merely the arrangement of the data into tables, or 
in a tabular form. The data in the original form are ungrouped; 
when they are summarized into a table they are grouped . 

¹ Statistically speaking, any mass of data is a population.

² Chapter 13 will deal more specifically with the problem of sampling.

The following table, Table 6, gives the scores made in college
algebra by 125 first-year students at Bucknell University. These
scores constitute a sample selected at random from a larger
population. The grades are given to the nearest integer on the centigrade
scale. This means, we recall, that a grade recorded as 92 might
represent any mark between 91.5 and 92.5. We note that the
lowest recorded score is 48 and the highest score is 97, giving a
range of 97 - 48 = 49. The possible range is from 47.5 to 97.5, or 50.

Table 6. Semester Grades of 125 Students in College Algebra
at Bucknell University
(Grades recorded to the nearest integer)

93  83  77  75  70  88  69  68  71  63
86  58  53  50  95  79  89  87  84  78
82  81  78  81  74  80  75  76  77  75
73  48  76  69  55  74  62  95  90  84
75  87  65  70  68  76  70  55  63  79
65  80  97  91  64  68  70  79  86  83
80  57  60  65  79  80  76  82  75  60
75  77  62  59  92  85  73  74  77  70
68  65  70  72  69  90  85  85  81  80
77  67  66  67  63  77  73  74  75  73
69  81  80  72  72  85  82  77  73  73
74  74  75  72  70  71  75  76  76  77
74  75  71  70  72


If these grades in college algebra are arranged in the order of
magnitude the array will be more suitable for study than in the
haphazard arrangement in Table 6, yet even the grades arranged
in this manner will still be unwieldy for a close analysis. A really
compact form may be obtained by arranging the measures into
classes of equal width, for example, 47.5-52.5, 52.5-57.5, etc., wherein
the class interval or class width is 5 points. The number of items or
measures occurring in each class (called the class frequency) is then
determined by tallying.

The traditional method of tallying is to record the frequencies by 
marks until four have been made, then to make a cross mark for the 
fifth score. This procedure makes up the preliminary sheet. 

The procedure described above for tallying offers no facilities for 
checking. If a repetition of the classification leads to a different 
result, we have no means of tracing the error. If the number of 
observations is large, it is better to enter the values on cards, one 
card to each measure, then sort the cards into the classes we desire. 
We can then check each pack, thereby placing each measure in the 
proper class. 

The tabular arrangement — illustrated by Table 7 — consisting 
of a series of classes and a corresponding set of frequencies is called a 
simple frequency distribution. We designate the total frequency by N. 


Table 7. Semester Grades of 125 Students in College Algebra
Preliminary Sheet

Class        Tally                                              Frequency
92.5-97.5    ||||                                               4
87.5-92.5    ||||/ |                                            6
82.5-87.5    ||||/ ||||/ ||                                     12
77.5-82.5    ||||/ ||||/ ||||/ ||||                             19
72.5-77.5    ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||       37
67.5-72.5    ||||/ ||||/ ||||/ ||||/ ||||                       24
62.5-67.5    ||||/ ||||/ |                                      11
57.5-62.5    ||||/ |                                            6
52.5-57.5    ||||                                               4
47.5-52.5    ||                                                 2
Total                                                           125 = N


The organization of the data has thus been effected and the data 
are now prepared for the next step, the analysis. 
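The tallying can also be sketched in Python, which incidentally supplies the check that the pencil tally lacks. The list `grades` reproduces the 125 scores of Table 6; the names `class_of`, `lowest_boundary`, and `width` are our own, and the classes are those of width 5 beginning at 47.5.

```python
from collections import Counter

grades = [
    93, 83, 77, 75, 70, 88, 69, 68, 71, 63,
    86, 58, 53, 50, 95, 79, 89, 87, 84, 78,
    82, 81, 78, 81, 74, 80, 75, 76, 77, 75,
    73, 48, 76, 69, 55, 74, 62, 95, 90, 84,
    75, 87, 65, 70, 68, 76, 70, 55, 63, 79,
    65, 80, 97, 91, 64, 68, 70, 79, 86, 83,
    80, 57, 60, 65, 79, 80, 76, 82, 75, 60,
    75, 77, 62, 59, 92, 85, 73, 74, 77, 70,
    68, 65, 70, 72, 69, 90, 85, 85, 81, 80,
    77, 67, 66, 67, 63, 77, 73, 74, 75, 73,
    69, 81, 80, 72, 72, 85, 82, 77, 73, 73,
    74, 74, 75, 72, 70, 71, 75, 76, 76, 77,
    74, 75, 71, 70, 72,
]

lowest_boundary, width = 47.5, 5

def class_of(score):
    """Return the (lower, upper) boundaries of the class containing score."""
    k = int((score - lowest_boundary) // width)   # how many classes above the lowest
    lower = lowest_boundary + k * width
    return lower, lower + width

frequency = Counter(class_of(g) for g in grades)

# Print from the highest class down, as in the preliminary sheet.
for (lower, upper), count in sorted(frequency.items(), reverse=True):
    print(f"{lower}-{upper}  {count}")
print("Total", sum(frequency.values()))
```

Because the scores are integers and the boundaries end in .5, no score can fall exactly on a boundary, so the assignment to classes is unambiguous.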




Table 8. Semester Grades of 125 Students in College Algebra
(Grades recorded to the nearest integer)

             Form (a)                              Form (b)

Class        Class Mark    Frequency       Class Mark    Frequency
             X             f(X)            X             f(X)
92.5-97.5    95            4               95            4
87.5-92.5    90            6               90            6
82.5-87.5    85            12              85            12
77.5-82.5    80            19              80            19
72.5-77.5    75            37              75            37
67.5-72.5    70            24              70            24
62.5-67.5    65            11              65            11
57.5-62.5    60            6               60            6
52.5-57.5    55            4               55            4
47.5-52.5    50            2               50            2
Total                      125 = N         Total         125 = N


In the preparation of Table 8 we were cognizant that the data are
continuous and are recorded to the nearest integer. A score recorded
as 79, for example, really fell somewhere over the interval 78.5 to
79.5. Consequently we found it convenient to represent the end
values of the class intervals to tenths. If the data had been recorded
to tenths, we could have expressed the two figures defining each class
to hundredths.

The two figures that define a class are called the class limits of
the class. In some tabular representation of classes, the defining
numbers of the class are true class limits or class boundaries. 1 The
class boundaries can easily be determined as each boundary is half
way between the largest item in the lower class and the smallest item in
the next higher class. Thus in Form (a) above the largest measure
in the lowest class is 52 and the smallest value in the next higher class
is 53. The class boundary is half way between 52 and 53, that is at
52.5. The other boundary points in Form (a) can be found in a
similar manner.

1 Some authors call class boundaries closed class limits.

The difference between the lower boundary of one class and the
lower boundary of the next higher class is the class interval or class
width. The class interval is also the difference between the upper
boundaries of two adjacent classes. The upper boundary of one class
is the lower boundary of the next higher class, and the lower boundary
of one class is the upper boundary of the next lower class. That is,
for continuous data adjacent classes should “join up” or be contiguous.
The number half way between the upper and lower boundaries
of a class is the class mark. Thus

\[ \text{Class mark} = \frac{\text{Upper boundary} + \text{Lower boundary}}{2} \]

A class boundary is half way between the class marks of two adjacent
classes. The class boundaries of a class can be found by adding to
and subtracting from the class mark one half the class width. With
this in mind, Form (b) is a mere abridgment of Form (a).
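
The relation between Form (a) and Form (b) can be checked mechanically. The following is a minimal sketch in Python, not part of the original text; the function names are our own. It rebuilds the class boundaries of Form (a) from the class marks of Form (b) and the class width of 5, and then recovers each class mark from its boundaries.

```python
# Class marks taken from Form (b) of Table 8; the class width is 5.
class_marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
class_width = 5


def boundaries_from_mark(mark, width):
    """Return (lower, upper): the class mark minus and plus half the width."""
    half = width / 2
    return (mark - half, mark + half)


def class_mark(lower, upper):
    """Class mark = (upper boundary + lower boundary) / 2."""
    return (upper + lower) / 2


# Rebuild the Class column of Form (a) from the class marks of Form (b).
form_a = [boundaries_from_mark(m, class_width) for m in class_marks]

for (lower, upper), mark in zip(form_a, class_marks):
    # Each recovered pair of boundaries must give back the original mark.
    assert class_mark(lower, upper) == mark
    print(f"{lower}-{upper}   class mark {mark}")
```

Because every boundary is shared by two adjacent classes, the list of boundaries "joins up" as the text requires.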

Form (a), using class boundaries, is a widely used method of 
indicating the classes of a simple frequency distribution. It is 
suitable for discrete as well as for continuous data, and we recommend
it as our favorite method. However, other methods for defining the
classes are found in the literature of the subject. We shall present
and discuss some well known forms to which the data of Table 8 may 
be applied. 


Form (c)

Class         Class Boundaries   Class Mark    Class Mark
                                 Continuous    Discrete
93 a.u. 98    92.5-97.5              95            95
88 a.u. 93    87.5-92.5              90            90
etc.          etc.                   etc.          etc.


In Form (c), “93 a.u. 98” means “93 and under 98.” That is, 
in this class are found the measures as large as 93 but less than 98. 
The classes in Form (c) are defined by class limits but not by class 
boundaries. For clearness, we give the class boundaries which in 
turn assist us in finding the class marks. Form (c) is suitable for 
continuous and discrete data, but in using this form the student must 
recall that a score of 93 means any number in the interval 92.5 to 
93.5 and thus the lower boundary is 92.5. Similarly, a score of 88 has 
a lower boundary at 87.5. The class marks are now easily determined. 

Occasionally the classes are denoted by the smallest and largest
measures of a given class, and the class interval may appear to range
from the smallest to the largest measurement for each class. For
continuous variates, this method of defining the class does not show
the full range of the class and leaves gaps at the ends of the class.
In this, as in all forms of class representation, the statistician must
ascribe to each class the true class limits or the class boundaries, and
the true class mark. Thus in Form (d) in the given classes we indicate
the class limits by the smallest and largest values that may fall in a
given class. We have included, for emphasis and for clearness, the
class boundaries.


Form (d)

Class         Class Boundaries   Class Mark    Class Mark
                                 Continuous    Discrete
93-97         92.5-97.5              95            95
88-92         87.5-92.5              90            90
etc.          etc.                   etc.          etc.

Occasionally we find in the literature a tabular representation
similar to Form (e). This form states ambiguously what Form (c)
states more definitely. It is unsafe for tallying scores for the reason
that it is easy to mis-tally boundary scores. Thus, to which class
would a score of 93 belong? Again, we have included the class boundaries
for sake of clearness, also the class marks.


Form (e)

Class         Class Boundaries   Class Mark    Class Mark
                                 Continuous    Discrete
93-98         92.5-97.5              95            95
88-93         87.5-92.5              90            90
etc.          etc.                   etc.          etc.
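
The conversion from stated class limits to class boundaries in Forms (c) and (d) can be sketched as follows. This is an illustrative Python sketch under the text's assumption that a recorded score extends half a unit on either side of the entry; the function names and the `unit` parameter are our own.

```python
def boundaries_inclusive(smallest, largest, unit=1):
    """Form (d): the class is named by its smallest and largest recorded values."""
    half = unit / 2
    return (smallest - half, largest + half)


def boundaries_and_under(lower_limit, upper_limit, unit=1):
    """Form (c): 'lower a.u. upper' holds values as large as lower but less than upper."""
    half = unit / 2
    # The largest recorded value in the class is one unit below upper_limit,
    # so the upper boundary lies half a unit below upper_limit.
    return (lower_limit - half, upper_limit - half)


print(boundaries_inclusive(93, 97))   # Form (d) class 93-97
print(boundaries_and_under(93, 98))   # Form (c) class 93 a.u. 98
```

Both calls describe the same class, 92.5 to 97.5. No such rule can be written for Form (e) without first deciding where a score of 93 belongs, which is precisely the ambiguity noted above.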

In later chapters we shall find it necessary to locate certain division 
points on the X-scale: quartiles, deciles, percentiles. To find these 
points we shall need true class limits or class boundaries. 

The determination of true class marks is also very important as
many of our statistical constants, such as the arithmetic mean and
the standard deviation, are found from the class marks of the classes. 
In fact to save labor in computation, we shall find it necessary to 
assume that the items are uniformly distributed over the given 
intervals and that the class frequencies are concentrated at the 
class marks. 

From this discussion of the several forms it is evident that, inas- 
much as the class boundaries must eventually be found to aid in the 
analysis of the data, we can save ourselves confusion and time by 
adopting class boundaries in the beginning of our problem. This pro- 
cedure we have followed and it is one we highly recommend. 

It should be emphasized that when a score is tabulated in the 
proper class interval, it loses its identity. Of course it falls somewhere 
within the boundaries of the interval, but in computation we do not 
use it again. For computational purposes in effecting the numerical 
analysis, it is necessary that we concentrate the class frequency at 
the mid-point of the class interval. Thus, in our computations on 
Table 8, we replace the scores 93, 95, 97, 95 of the class 92.5 — 97.5 
by four scores each of value 95, the mid-value of the class. Similarly, 
we replace the scores 88, 90, 89, 90, 91, 92 of the class 87.5 — 92.5 
by six scores each of value 90, the mid-value of the class. And so 
on for the other classes. 

Although our assumption that the scores are evenly distributed over
the interval is seldom verified by observed data, if the sample is
sufficiently large the assumption leads only to a very slight error.
Some such assumption must be made, and experience and statistical 
theory recommend the assumptions of evenness of measures over 
the interval and the concentration of the class frequency at the mid- 
point of the class interval. 
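
The effect of concentrating the class frequency at the mid-point can be seen on the two sets of scores quoted above. The sketch below is ours and is only an illustration: it compares the sum of the individual scores in each class with the sum obtained when every score is replaced by the class mark. For these two classes the sums happen to agree exactly; in general the agreement is only approximate.

```python
# Scores quoted in the text for two classes of Table 8.
classes = {
    (92.5, 97.5): [93, 95, 97, 95],
    (87.5, 92.5): [88, 90, 89, 90, 91, 92],
}

comparison = []
for (lower, upper), scores in classes.items():
    mark = (lower + upper) / 2          # mid-point of the class interval
    actual_sum = sum(scores)            # sum of the individual scores
    grouped_sum = len(scores) * mark    # frequency concentrated at the mark
    comparison.append((mark, actual_sum, grouped_sum))
    print(f"class mark {mark}: actual sum {actual_sum}, grouped sum {grouped_sum}")
```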

Example 1. If 10 scores in integral variates are evenly distributed over
the interval 72.5 — 77.5, what are the scores? 

Since the scores are integers, 10 in number, and must be evenly dis- 
tributed over the interval, they would have the values 73, 73, 74, 74, 75, 
75, 76, 76, 77, 77. 

Example 2. Are the values 73, 73, 73, 73, 73, 77, 77, 77, 77, 77 evenly 
distributed over the interval 72.5 — 77.5? 

No. While the statistical results are essentially the same as if the entire 
10 scores are situated at the class mark, 75, these values are not evenly 
distributed over the given interval. 





Example 3. If 20 measurements, rounded to the nearest half-inch, are 
evenly distributed over the interval 72.25 — 82.25, what are their values? 

Their values are: 72.5, 73.0, 73.5, 74.0, . . ., 82.0. 

Would two of each of the following 10 measurements be satisfactory: 
73, 74, 75, . . ., 82? 

Example 4. What measurements would satisfy for the preceding example 
if the interval were 72 a.u. 82? What is the upper boundary of the class? 

The values would be: 72.0, 72.5, 73.0, . . ., 81.0, 81.5. The largest
value in the class is 81.5 and the smallest value in the next higher class is
82.0. The class boundary is the value half way between them, namely
81.75.
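
The values asked for in Examples 1, 3, and 4 can be listed by enumerating the recordable values that lie between the class boundaries. The following Python sketch is our own illustration; the helper name is hypothetical, and exact fractions are used so that half-units cause no rounding trouble.

```python
from fractions import Fraction


def recordable_values(lower_boundary, upper_boundary, unit):
    """Multiples of `unit` lying strictly between the two class boundaries."""
    lower = Fraction(lower_boundary)
    upper = Fraction(upper_boundary)
    step = Fraction(unit)
    k = lower // step + 1          # index of the first multiple above `lower`
    values = []
    while k * step < upper:
        values.append(k * step)
        k += 1
    return values


# Example 1: integers in 72.5 to 77.5, ten scores spread evenly (two per value).
example_1 = sorted(recordable_values("72.5", "77.5", "1") * 2)
print([int(v) for v in example_1])

# Example 3: half-inch measurements in 72.25 to 82.25.
example_3 = recordable_values("72.25", "82.25", "1/2")
print(len(example_3), float(example_3[0]), float(example_3[-1]))

# Example 4: the class "72 a.u. 82" has boundaries 71.75 and 81.75.
example_4 = recordable_values("71.75", "81.75", "1/2")
print(len(example_4), float(example_4[0]), float(example_4[-1]))
```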

EXERCISES 

1. Suppose the data are dinner checks from a cafeteria. Show that two 
checks for each of the values 93, 94, 95, 96, 97 cents would give the same 
total as 10 checks of 95 cents each. 

2. Suppose the temperature at Lewisburg is recorded to the nearest tenth
of a degree and that a 5 degree class interval has been selected. If the class
limits are 60.0-64.9, 55.0-59.9, 50.0-54.9, etc., what are the class
boundaries and the class marks of the three classes?

3. A group of intelligence quotients (continuous data) are arranged with 
the class intervals as follows: 75 — 79, 80 — 84, etc. What are the class 
boundaries and the class marks? 

4. What values are contained in the interval 75 — 79 if the data are 
discrete? If the data are continuous and recorded to the nearest integer? 

13. THE CHOICE OF THE CLASS INTERVAL 

In the choice of a class interval, the following brief suggestions may 
be helpful: 

1. The number of classes should, in general, not be less than 10 nor 
more than 30, seldom more than 25. 

2. If possible, the class intervals should be uniform in width. 

3. In general there should be no class intervals without definite limits. 
Intervals of the type “all over” and “all under” are to be avoided 
when possible. 1

4. To facilitate computation, class intervals of multiples of 5 or 10
are convenient.

1 Many of the tables found in the data sent out by the United States
Government are of this type.


14. CLASS LIMITS

The lowest limit of the lowest class may be chosen in many positions.
This choice and that of the class interval will practically
determine the limits of the other classes. We rather hesitate to
state many rules for their selection; much must be left to the judgment
and resourcefulness of the student. The following suggestions
should prove helpful:

1. To facilitate computation, the mid-points should be integers. We 
shall find that carrying out this suggestion is frequently impossible. 

2. Certain types of data are loaded at special points. For example, 
college marks on a centigrade scale are loaded at 60, 65, 70, 75, etc. 
Distributions in which age is the independent variable are usually 
loaded at 20, 25, 30, etc. When the data display such a peculiarity, 
these loaded points should be chosen as mid-points of the class in- 
tervals. This is especially to be kept in mind, since in the analysis 
of our distributions we shall assume that all measures of a class are 
concentrated at the mid-point of the class. 

3. Class limits should be unambiguous and mutually exclusive. 
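
These suggestions can be sketched as a small routine that lays out classes of uniform width whose mid-points fall on the loaded points. The Python below is a hedged illustration only: the function name is our own, and the extreme grades 48 and 97 are hypothetical values chosen to match the classes of Table 8, not figures taken from the data.

```python
def build_classes(lowest, highest, width, lowest_mark):
    """Return (lower boundary, class mark, upper boundary) for each class.

    `lowest_mark` is the class mark of the lowest class, chosen so that the
    marks fall on the points at which the data are loaded.
    """
    half = width / 2
    if lowest < lowest_mark - half:
        raise ValueError("lowest observation falls below the first class")
    classes = []
    mark = lowest_mark
    while mark - half < highest:
        classes.append((mark - half, mark, mark + half))
        mark += width
    return classes


# Hypothetical extremes 48 and 97; marks at 50, 55, ..., as for college grades.
classes = build_classes(lowest=48, highest=97, width=5, lowest_mark=50)
for lower, mark, upper in classes:
    print(f"{lower}-{upper}   class mark {mark}")
print(len(classes), "classes")
```

The classes so obtained are contiguous and mutually exclusive, since no integral grade can fall on a boundary ending in .5.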

The class limits can be decided accurately only when the accuracy 
of the data is known. A score for either type of variate is assumed to 
extend from half a unit in the last place of measurement below to half 
a unit above the entry recorded. If the data are accurate to tenths, 
the class limits should be expressed to hundredths; if heights are 
measured to the nearest quarter of an inch, the class limits should 
be arranged in eighths of an inch. 
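
The half-unit rule of this paragraph can be written out directly. The sketch below is ours; the function name and the sample height of 66.25 inches are hypothetical. It uses exact fractions so that tenths and quarters are represented without rounding.

```python
from fractions import Fraction


def true_range(recorded, unit):
    """Interval covered by a value recorded to the nearest `unit`."""
    value = Fraction(recorded)
    half = Fraction(unit) / 2
    return (value - half, value + half)


# Nearest integer: 79 stands for 78.5 to 79.5.
print([float(x) for x in true_range("79", "1")])

# Nearest tenth: the limits are expressed to hundredths.
print([float(x) for x in true_range("74.6", "0.1")])

# Nearest quarter inch (hypothetical height): the limits fall on eighths.
print([float(x) for x in true_range("66.25", "1/4")])
```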

In many of the exercises that follow in the text, it will not be 
possible to carry out the suggestion of the preceding paragraph, 
because the original observers were not meticulously careful to state 
the accuracy of the original measurements. When this is the case 
we shall have to make some reasonable assumptions and proceed 
along the line suggested by them. 

EXERCISES 

1. The following diagram for the first class of Form (a), Table 8, shows
that the mid-point of the class (the class mark) is 95. Make similar diagrams
to explain the other Forms.

Diagram 1

92.5    93    94    95    96    97    97.5

2. Suppose you were asked to construct a frequency table of the grades
in Table 6 (p. 24) with the class marks at 97.5, 92.5, etc. and with a class
interval of 5, what would you say?





3. An educational research department recently sent the author a 
score card for some data (to the nearest integer). The Class column was 
marked thus: 0-4, 5-9, 10-14, etc. What are the class marks? the true 
class limits? 

4. The daily wages of 100 men were recorded to the nearest cent.
Complete the table finding the class boundaries and the class marks.

Class         Class Boundaries   Class Mark   f(X)
$2.25-2.49                                      5
 2.50-2.74                                     11
 2.75-2.99                                     23
 3.00-3.24                                     29
 3.25-3.49                                     17
 3.50-3.74                                      9
 3.75-3.99                                      6
Total                                         100


5. The weights of 1,000 male students (in pounds) were recorded to the
nearest half pound. Complete the table.

Class Boundaries   Class Mark   f(X)
                     105.25       4
                     115.25      12
                     125.25      20
                     etc.        etc.

6. The heights of 1,000 male students (in inches) were recorded to the
nearest tenth of inch. Complete the table.

Class            Class Boundaries   Class Mark   f(X)
60.8 a.u. 62.8                        61.75        1
62.8 a.u. 64.8                        63.75        3
64.8 a.u. 66.8                        65.75       11
etc.                                  etc.        etc.


7. The ages at marriage of 100 women were distributed as shown in
the table. Find the class boundaries.

Class    Class Boundaries   Class Mark   f(X)
15-19                          17          4
20-24                          22         28
25-29                          27         23
etc.                           etc.       etc.


8. The number of pedicels per cluster of a certain plant resulted in the
following distribution. Find the class boundaries.

Class    Class Boundaries   Class Mark   f(X)
12-19                         15.5         8
20-27                         23.5        52
28-35                         31.5       176
etc.                          etc.       etc.


9. A distribution of heights of students (in centimeters) was arranged
as follows:

Height          Class Mark   f(X)
(centimeters)
155-157            156         4
158-160            159         8
161-163            162        26
etc.               etc.       etc.

What are the class boundaries?

Can you guess at the accuracy of the original measurements?
Do you agree with the class marks?


10. Suppose you are given 500 grades (in per cent) in English to distribute.
The lowest grade is 20% and the highest grade is 90%. You decide
upon a class width of 5%. Which of the two groupings, A or B, would
be preferable?

A

Class        X      f(X)
19.5-24.5    22
24.5-29.5    27
etc.         etc.

B

Class        X      f(X)
17.5-22.5    20
22.5-27.5    25
etc.         etc.




11. In his book, “The Fundamentals of Statistics,” Professor L. L.
Thurstone tabulates the scores made on an intelligence test by 140 freshmen
at Swarthmore College. His table follows:

Scores     Class Mark   f(X)
40-49          45         1
50-59          55         5
60-69          65        12
70-79          75        21
80-89          85        23
90-99          95        23
100-109       105        25
110-119       115        14
120-129       125        11
130-139       135         4
140-149       145         1
Total                   140

What are the class boundaries?

Using our assumptions, what are the class marks of the classes?

What are the tacit assumptions that Professor Thurstone makes regarding
the extreme scores in the classes?


12. The following data pertain to the ages of unemployed male workers
in Boston in 1930. Professor R. C. White in his “Social Statistics,”
page 215, takes the classes and the class marks as shown.

Age (Years)   Class Mark   f(X)
10-14            12.5
15-19            17.5
20-24            22.5
etc.             etc.

What are the class boundaries?

Do you agree with the class marks?

What assumptions does Professor White evidently make regarding the
largest and smallest ages of a class?

13. In his “Statistical Methods, Revised,” Professor F. C. Mills exhibits
on page 105 a distribution of the weekly earnings of workers in
open-hearth furnaces in the Pittsburgh district in 1935. A portion of the table
is shown here.

Class Interval           Mid-point   Frequency
(in dollars per week)
$0- 3.99                     2           67
 4- 7.99                     6          290
 8-11.99                    10          437
etc.                        etc.        etc.

What are Professor Mills' assumptions regarding the values that are
placed in the given classes?

According to our assumptions what would be the mid-points of the
classes?

14. In Davies and Yoder, “Business Statistics,” pages 110 and 114 we
find the following distribution:

L1-L2     X     f
10-12    11     3
12-14    13    15
14-16    15    20
16-18    17    10
18-20    19     2

What are the assumptions of the authors regarding the values that are
placed in the several classes?

According to our assumptions what would be the values of X?

15. In his book, “Statistics for Students of Psychology and Education,”
Professor Herbert Sorenson on page 43 exhibits a distribution of
scores obtained on an objective test in educational psychology. Here is a
portion of the table.

Scores by Intervals     X      f
80-84                  82.5    3
75-79                  77.5    5
70-74                  72.5    7
etc.                   etc.   etc.

What are Professor Sorenson's assumptions regarding the scores in the
given intervals? (See page 44 of his text.)

What are the class boundaries and the values of X according to our
assumptions?




16. In his book, “The Mathematical Part of Elementary Statistics,”
Professor B. H. Camp gives on page 8 a distribution of wage data, a
portion of which we show here. What are Professor Camp’s assumptions
regarding the scores in the given intervals? What are the class boundaries
of the classes?

Class         Mid-value    f
$4.50-5.99      5.245     43
 6.00-7.49      6.745     99
etc.            etc.      etc.


The illustrations found in these Exercises certainly show that 
authorities differ in their interpretations of class limits, class marks, 
class boundaries, et cetera. In reading the literature of our field 
we must be alert, therefore, to the assumptions, either tacit or ex- 
pressed, that guide the procedure. Further, we must be charitable 
and seek to understand what are the assumptions that are guiding 
an author’s steps, and realize that there is more than one way of 
doing a simple statistical task. 

The problem we are discussing is simply this: what is meant by 
a recorded score of 74? of 74.6? of 74.67? We assume that if a score 
is recorded 74, its value is between 73.5 and 74.5; if it is recorded 
74.6, its value is between 74.55 and 74.65; and so on. On the con- 
trary, many statisticians assume that if a score is recorded 74, its 
value ranges from 74 to but not including 75; if a score is recorded 
74.6, it ranges from 74.6 to but not including 74.7; and so on. They
also use another method of description. They assume that a recorded 
score of 74 ranges from 74 to 74.99; a recorded score of 74.6 ranges 
from 74.6 to 74.699; and so on. 
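
The practical consequence of the two conventions is a difference in the class marks. As a rough sketch, with function names of our own choosing, the Python below computes the class mark of a class written 70-74 under each convention.

```python
def mark_nearest_unit(smallest, largest, unit=1):
    """Our convention: a recorded value extends half a unit on either side."""
    lower = smallest - unit / 2
    upper = largest + unit / 2
    return (lower + upper) / 2


def mark_to_but_not_including(smallest, largest, unit=1):
    """Alternative convention: a recorded value v ranges from v up to v + unit."""
    lower = smallest
    upper = largest + unit
    return (lower + upper) / 2


print(mark_nearest_unit(70, 74))            # 72.0
print(mark_to_but_not_including(70, 74))    # 72.5
```

The two class marks differ by half a unit of measurement, which is the kind of discrepancy the preceding Exercises exhibit.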

The mathematician, accustomed to rigor in his thinking, generally 
prefers our method of description, namely, that the classes be de- 
termined rigidly by class boundaries, whereas the worker in an applied 
field may be willing to sacrifice some rigor. This is one of the con- 
troversial questions in statistical procedure so let us not assume that 
we have the full truth. After all, it is not a matter of extreme im- 
portance whether the scores on an English test average 74.26% or 
74.16%. It is essential, however, that we impose refinements when 
the data warrant them. It is just as essential that we do not give 
a false impression of accuracy in our procedures. 





15. GRAPHICAL REPRESENTATION 

When the data have been organized into a suitable table, they are 
now ready for the first step in the analysis, that of presenting the 
data graphically. Graphical presentations display outstanding facts 
and bring into bold relief relationships that otherwise would be 
difficult to comprehend or possibly would not be noted at all. A 
column of figures may overwhelm us; the same data in graphic form 
may tell an easily understood story. Relative quantities especially 
can be grasped through visual means with a comprehensiveness that 
is not possible by pure analysis. 

While the ultimate basis of graphical presentation is mathematical, 
yet the practical work of constructing the charts can be accomplished 
without a profound knowledge of the true mathematical basis. 
Charts and graphs, then, can enable us to discover simply and 
quickly many facts and mathematical relationships about numerical 
data without the use of more difficult methods of analysis. The 
careful statistician, however, will be very cautious to verify by the 
more precise methods of analysis the suggestions that he receives 
from the graph. 

It is not our intention to present in this book a detailed account
of the many graphical procedures that are used today. We shall
explain certain important principles of graphic presentation and leave
it to the reader who desires a more comprehensive knowledge to
consult the excellent volumes that are accessible. 1

1 Excellent references for graphical presentation are listed in the bibliography
in Appendix A of this volume.

16. GRAPHICAL REPRESENTATION OF FREQUENCY
DISTRIBUTIONS

Probably the best graphical representation of a simple frequency
distribution is furnished by a column diagram or histogram. It is
constructed by erecting upon the class intervals rectangles whose
altitudes are proportional to the frequencies. Suitable scales must
be chosen so that the graph of the data can be made to fit the data,
be of sufficient size to be readily interpreted, and be of such proportions
that it will be agreeable to our artistic tastes. The left-hand
side of the first rectangle is plotted at the lower boundary of the
lowest class and the right-hand side of the last rectangle is plotted at
the upper boundary of the highest class. Chart 1 shows the histogram
for the distribution of grades in college algebra previously tabulated
in Table 8 (p. 26).

Chart 1

The student will note that each rectangle contains an area that is
proportional to and represents the frequency of the class and that
the total area equals the total frequency times the class width. If the
class width is taken as the unit, the total area equals the total frequency.
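
The statement about areas is easily verified for the data of Table 8. The following is a small sketch of our own in Python: each rectangle has the class width as base and the class frequency as altitude.

```python
# Frequencies of Table 8, lowest class (47.5-52.5) first.
frequencies = [2, 4, 6, 11, 24, 37, 19, 12, 6, 4]
class_width = 5

# Area of each rectangle of the histogram: base times altitude.
areas = [class_width * f for f in frequencies]
total_area = sum(areas)
N = sum(frequencies)

print("total frequency N =", N)
print("total area        =", total_area)

# With the class width taken as the unit of the base, the total area is N.
areas_unit_width = [1 * f for f in frequencies]
print("total area, unit width =", sum(areas_unit_width))
```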

Another method of representing graphically a frequency distribution
is by what is called a frequency polygon. Its construction is
very much like the plotting of curves and line diagrams in elementary
algebra. In Form (b) of Table 8 (p. 26), each pair of values X,
f(X) defines a point. Plotting the several points and connecting them
by a broken line, we obtain the frequency polygon. The last points at
either end must be joined to the base at the center of the next class
interval. The observing student will note that the vertices of the
frequency polygon are merely the mid-points of the tops of the rectangles
of the histogram, and that the ordinates represent the frequencies.
Chart 2 shows the frequency polygon for the grades in college
algebra displayed in tabular form in Table 8 (p. 26).
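
The vertices of the polygon for Table 8 can be listed as below. This Python sketch is ours and only illustrative: it closes the polygon at the centers of the next class intervals on either side (45 and 100) and computes the enclosed area by trapezoids, which for classes of equal width comes out equal to the area of the histogram.

```python
# Form (b) of Table 8, lowest class mark first.
class_marks = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
frequencies = [2, 4, 6, 11, 24, 37, 19, 12, 6, 4]
class_width = 5

# Join the end points to the base at the centers of the adjacent classes.
vertices = (
    [(class_marks[0] - class_width, 0)]
    + list(zip(class_marks, frequencies))
    + [(class_marks[-1] + class_width, 0)]
)

# Area between the broken line and the base, one trapezoid per segment.
polygon_area = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])
)

histogram_area = class_width * sum(frequencies)
print(vertices)
print("polygon area:", polygon_area, " histogram area:", histogram_area)
```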

Chart 2

The fact that the polygon extends beyond the limits of the table
suggests that if the grades of a larger group of students were taken,
a few would have been found with grades less than any in our sample
and a few with grades larger than any in our sample. Both the
histogram and the polygon show graphically the outstanding facts
of the sample considered. If one is interested in the sample only,
this representation is sufficient. However, the purpose of the
investigation would usually be to answer certain questions regarding
the larger group, the parent population, of all the grades at this
institution in college algebra. The frequency polygon for the parent
population would resemble very closely a smooth curve.






If the class interval be made smaller and smaller and the total 
frequency, N, be increased without limit, the limit approached by 
the histogram and the frequency polygon is termed a frequency curve.
In Chart 1, a frequency curve has been drawn. 

It should be borne in mind that this frequency curve brings out the 
general tendencies of the parent population by means of what we have 
assumed to be a representative sample. This curve meets the base 
line near the same points at which the frequency polygon meets it; 
it rises, slowly at first, to a maximum, then recedes again to the base 
line. The curve should be so drawn that the total area under the 
curve is equal to the total area of the histogram. 

The graphical representation of the data of Table 9 brings out in 
bold relief the outstanding facts that would possibly not be noted by 
a glance at the table. Since our primary aim here is the comparison 
of the two sets of mortality rates, we shall superimpose them on the 
same graph sheet, Chart 3, so that they may be readily compared. 1 


Chart 3 



1 The reader will note that the age divisions of Chart 3 are unequal. 









Table 9. Mortality Rates per 100,000 Population for Typhoid
Fever in the Registration States, 1910 and 1920 1

Age          Mortality Rate   Mortality Rate
             in 1910          in 1920
Under 1           5.5              1.1
1 to 4           11.9              2.4
5 to 9           13.0              3.5
10 to 14         16.6              5.6
15 to 19         31.2              8.5
20 to 24         37.1              8.0
25 to 34         30.4              6.2
35 to 44         22.1              5.5
45 to 54         20.4              4.7
55 to 64         18.5              4.7
65 to 74         16.9              4.7
75 and on        14.0              2.0

1 Mortality Statistics, 1910-1920, United States Bureau of the Census, p. 36.


EXERCISES

1. The lengths of a sample of 75 beans were measured to the nearest
tenth of a centimeter. The results are shown in the following distribution:

Distribution of Length of 75 Beans

Length       Class Mark   Frequency
                 X           f(X)
1.45-1.55       1.5            2
1.55-1.65       1.6            4
1.65-1.75       1.7            6
1.75-1.85       1.8            8
1.85-1.95       1.9           12
1.95-2.05       2.0           20
2.05-2.15       2.1           11
2.15-2.25       2.2            9
2.25-2.35       2.3            2
2.35-2.45       2.4            1
Total                         75

Draw the histogram for these data. Connect the mid-points of the tops
of the rectangles, complete at the extremes as previously directed, and
thereby obtain the frequency polygon. What is the total area of the
histogram? of the polygon?





2. If 1,024 throws are made with 10 coins, theoretically, the following
results are “expected”:

Theoretical Frequencies in Coin-Tossing

Number of Heads   Frequency
Turning up X         f(X)
      0                 1
      1                10
      2                45
      3               120
      4               210
      5               252
      6               210
      7               120
      8                45
      9                10
     10                 1
Total               1,024

Draw the frequency polygon.

This distribution, we observe, is symmetrical with respect to a vertical
line drawn through the point (5, 0). While symmetrical distributions
never occur in observed data, they are closely approximated in biological
and anthropometric measurements. Many educational measurements
also result in series that possess remarkable degrees of symmetry.
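
The theoretical frequencies of the coin-tossing table can be reproduced from the binomial coefficients. The sketch below is our own illustration, assuming fair coins; it relies on the standard-library function `math.comb` (Python 3.8 or later) and also checks the symmetry about X = 5.

```python
from math import comb

throws = 1024
coins = 10

# Expected number of throws showing k heads: throws * C(coins, k) / 2**coins.
expected = [throws * comb(coins, k) // 2**coins for k in range(coins + 1)]

for heads, frequency in enumerate(expected):
    print(f"{heads:2d}  {frequency:4d}")

# The distribution reads the same from either end.
print("symmetrical:", expected == expected[::-1])
print("total:", sum(expected))
```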

3. As another example of a series of discrete variates, consider the
distribution of the following table:

Distribution of Rays in Tail Fins of 703 Flounders 1

Number of Rays   Number of Flounders
      X                 f(X)
     47                   5
     48                   2
     49                  13
     50                  23
     51                  58
     52                  96
     53                 134
     54                 127
     55                 111
     56                  74
     57                  37
     58                  16
     59                   4
     60                   2
     61                   1
Total                   703

Draw the histogram and a frequency curve for these data. 2

1 Paul Riebesell, Biometrik und Variationsstatistik, p. 760.

2 As with all discrete variates, this curve is defined only at the points
determined by the data. We draw the curve merely to emphasize the characteristics
of the distribution.




4. The data in the following table give the frequencies of the numbers
of petals on a certain series of the plant named. They illustrate what is
called the J-shaped distribution.

Frequencies of Petal Numbers,
Ranunculus Bulbosus

Number of Petals   Frequency
       X              f(X)
       5               133
       6                55
       7                23
       8                 7
       9                 2
      10                 2
Total                  222

Plot the histogram and the frequency curve.

17. GRAPHICAL REPRESENTATION OF TEMPORAL 
DISTRIBUTIONS 

The distributions we have thus far considered have dealt mainly
with biological and educational data. They have not generally
been primarily related to time. The tabular representations have,
in general, shown few members at the extremes but they have shown
a comparatively large number in the central portions of the tables.
The graphical representations, whether by histogram, polygon, or
curve, have possessed a common description, namely, low at each
end with a maximum near the center. We shall call such distributions
[...] is mound-shaped. Of the distributions previously considered,
some have shown wide variation or dispersion, whereas others have
shown moderate variation. Further, they have been more or less
unsymmetrically distributed about any line or point. In other words,
they have possessed a quality of asymmetry or skewness. The distribution of
algebra grades seemed considerably peaked (leptokurtic) near the
center. This quality of “peakedness” (or “flatness”) is called kurtosis
or excess. 1

1 Kurtosis by the British school; excess by the Scandinavian school.


In the chapters that follow we shall develop measures of these 
qualities of the distributions. Our present task is the organization 
and the graphical representation of the data; our next problem will 
be its algebraical and arithmetical analysis. 

Another type of distribution frequently encountered in dealing
with economic and mortality data is that in which time is the independent
variable. Such distributions are called temporal distributions
or time series.

We shall note that time series display a number of distinct types
of movement such as long-time trends, seasonal variation, cyclical
movements, et cetera. These types of movement call for close examination.

As a first example, consider the growth of population of the United 
States from 1790 to 1930 inclusive. 

Table 10. Population: Continental United States, 1790-1930 1

Census   Population     Per Cent of Increase
Year     (thousands)    over Preceding Census
1790         3,929
1800         5,308            35.1
1810         7,240            36.4
1820         9,638            33.1
1830        12,866            33.5
1840        17,069            32.7
1850        23,192            35.9
1860        31,443            35.6
1870        38,558            26.6*
1880        50,156            26.0*
1890        62,948            25.5
1900        75,995            20.7
1910        91,972            21.0
1920       105,711            14.9
1930       122,725            16.1

* Estimated rates are given here.

The graphical representation of these data is shown in Chart 4. 
We note that the population has enjoyed a steady growth. From 
1790 to 1860 each census increased approximately one-third, usually 
somewhat more, over the preceding; from 1860 to 1890 the decade 
rates of growth were somewhat over one-fourth, and from 1890 to 
1910 a little over one-fifth. Since 1910 the decade rates of increase 
have been about 15 per cent. Hence we see that, whereas the popula- 
tion has steadily increased, the rate of increase has been steadily 
decreasing. 
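
The per cent of increase over the preceding census can be recomputed from the population column of Table 10. The Python sketch below is our own illustration. Note that the table marks the rates for 1870 and 1880 as estimated, so a direct computation from the tabulated counts need not reproduce those two entries.

```python
# Population in thousands, from Table 10.
population = {
    1790: 3929, 1800: 5308, 1810: 7240, 1820: 9638, 1830: 12866,
    1840: 17069, 1850: 23192, 1860: 31443, 1870: 38558, 1880: 50156,
    1890: 62948, 1900: 75995, 1910: 91972, 1920: 105711, 1930: 122725,
}

years = sorted(population)
rates = {}
for earlier, later in zip(years, years[1:]):
    increase = population[later] - population[earlier]
    # Per cent of increase over the preceding census, to one decimal place.
    rates[later] = round(100 * increase / population[earlier], 1)

for year in years[1:]:
    print(year, rates[year])
```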

1 The data are taken from the Fifteenth Census of the United States , Bureau 
of the Census, Vol. I, Population, p. 6. 





Chart 4 



As a second example of time series, consider the following table 
which gives the production of lumber in the United States in billions 
of board feet for the given years. 


Table 11. Lumber Production in the United States 1

Year   Reported Production
       (billions of board feet)
1909          44.5
1910          40.0
1911          37.0
1912          39.2
1913          38.4
1914          37.3
1915          37.0
1916          39.9
1917          35.8
1918          31.9
1919          34.6
1920          33.8
1921          27.0
1922          31.6

1 The data are taken from Statistical Abstract of the United States, 1928, p. 689.









These data are represented graphically in Chart 5. This is a typical 
diagram for a historical series. The broken line which represents the 
production oscillates back and forth on either side of the line of trend 
which we have estimated graphically. All the points for the produc- 
tion polygon lie within a comparatively narrow strip of which the 
trend line is the center. Both the trend line and the production
polygon emphasize the general diminishing of the production during 
the years in question. In a later chapter we shall discuss methods 
for a closer analysis of these data. 
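The trend line of Chart 5 was estimated graphically. As a rough numerical cross-check only, one may fit a straight line by least squares; this is a substitute technique of our own choosing, not the graphical method used for the chart, and the function name `least_squares_line` is illustrative.

```python
# Lumber production (billions of board feet), copied from Table 11.
years = list(range(1909, 1923))
production = [44.5, 40.0, 37.0, 39.2, 38.4, 37.3, 37.0,
              39.9, 35.8, 31.9, 34.6, 33.8, 27.0, 31.6]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_line(years, production)
print(f"fitted trend: {slope:.2f} billion board feet per year")
```

A negative slope agrees with the general diminishing of production noted above; a line drawn by eye need not coincide with the fitted one.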

Chart 5 affords a good illustration of the possibilities of omitting 
unimportant areas. In order to give emphasis to the main facts of 
the data, we obey the instructions of the Joint Committee on Stand- 
ards for Graphic Presentation 1 to the effect that the zero line should 
be shown by the use of a horizontal break in the diagram. 


Chart 5 



1 A. C. Haskell, How to Make and Use Graphic Charts, 1919, p. 71.






The recommendations of this committee should be observed when a 
single set of data is exhibited. It may not be advisable to carry out 
the recommendations when two sets of data are placed upon the same 
graph sheet. We found it possible to do this on Chart 5, but for the 
data in Table 12, though it is possible, it is inadvisable. 

One purpose of a graph is to emphasize outstanding facts, to make 
evident outstanding relationships. To accomplish this, the proper 
scales must be selected. The selection of the scales that will give due 
emphasis to the facts and relationships may not be the scales such 
that the zero lines for both sets of data can be shown on the diagram. 

Consider the data of Table 12. Here we freely omit, without
confusing the figure, the zero lines for both sets of data. This table
gives the quantity of beef available for consumption per capita per
annum, and the wholesale price of steers per 100 pounds for the
given years.

Table 12. Beef: Quantity Available per Capita per Annum
Steers: Price per Hundredweight in Dollars 1

        Beef Available   Price per Cwt.            Beef Available   Price per Cwt.
Year       (pounds)        (dollars)       Year       (pounds)        (dollars)

1902         68.5            7.47          1910         71.1            7.77
1903         76.0            5.57          1911         67.7            7.23
1904         73.6            5.96          1912         61.1            9.36
1905         73.0            5.97          1913         60.6            8.93
1906         72.6            6.13          1914         58.5            9.65
1907         77.5            6.54          1915         54.5            9.31
1908         71.5            6.82          1916         56.0           10.42
1909         75.4            7.34

The graphical representation of these data is found in Chart 6. 
During this fifteen-year period we note that the quantity of beef 
available per capita per annum has generally decreased, whereas the 
wholesale price per hundredweight has almost steadily increased. 
That is, the quantity available has shown a downward trend,
whereas the price has shown an upward trend.
The trend for the quantity available seems to be curvilinear, while 
that for the price seems to be linear. These trends and their relation- 
ships will be further analyzed in Chapter 8. 

1 The data are taken from Yearbook of Agriculture, 1928, p. 962; United States
Bureau of Labor Statistics, Bulletin No. 335, p. 38.



Chart 6 



18. CUMULATIVE DISTRIBUTIONS AND CURVES 

Frequently the chief interest in a frequency distribution is not so 
much in the items as they are distributed in the several classes as in 
the accumulated totals of certain of the classes. We may, for ex- 
ample, be chiefly interested in the number of students who receive 
“more than” or “less than” a given mark; in the number of em- 
ployees who receive “more than” or “less than” a given wage; in 
the number of families who receive “more than” or “less than” a 
given income. 

We are thus led to a discussion of cumulative distributions and to 
their graphical representations, known as cumulative curves. 1

1 The cumulative curve is sometimes called an ogive.

Consider, for example, the distribution of Table 13, which illus-
trates the formation of a “less than” distribution. The column
denoted by Cum. f(X) gives us the number of the given sample who
receive an income less than a given amount, and the column denoted
by Cum. f(X)/N enables us to note the per cent of the total frequency,
N, who receive less than a given amount. Thus, of the 1,810 in-
comes considered in the sample, 640 or 35 per cent received less than
$400; 1,600 or 88 per cent received less than $700; 1,769 or 98 per
cent received less than $1,000.

Table 13. Distribution of the Estimated Income among
Unmarried Women of the United States in 1910 1

   Income        Number     Income less than
  (dollars)       f(X)         (dollars)        Cum. f(X)    Cum. f(X)/N

    0-  200         10            200                10         0.006
  200-  300         70            300                80         0.044
  300-  400        560            400               640         0.354
  400-  500        530            500             1,170         0.646
  500-  600        280            600             1,450         0.801
  600-  700        150            700             1,600         0.884
  700-  800        110            800             1,710         0.945
  800-  900         37            900             1,747         0.965
  900-1,000         22          1,000             1,769         0.977
1,000-1,100         16          1,100             1,785         0.986
1,100-1,200         12          1,200             1,797         0.993
1,200-1,300          8          1,300             1,805         0.997
1,300-1,400          5          1,400             1,810         1.000

Total            1,810
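The two cumulative columns of Table 13 are running totals of the frequency column. A minimal sketch of their formation follows; the variable names are our own.

```python
from itertools import accumulate

# Upper class limits (dollars) and class frequencies f(X), copied from Table 13.
upper_limits = [200, 300, 400, 500, 600, 700, 800, 900,
                1000, 1100, 1200, 1300, 1400]
frequencies = [10, 70, 560, 530, 280, 150, 110, 37, 22, 16, 12, 8, 5]

# Cum. f(X): running total of the frequencies up to each upper limit.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]  # N, the total frequency

# Cum. f(X) / N, rounded to three places as in the table.
proportion = [round(c / total, 3) for c in cumulative]

for limit, c, p in zip(upper_limits, cumulative, proportion):
    print(f"less than {limit:>5}: {c:>5}  {p:.3f}")
```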

The diagram for the cumulative distribution of Table 13 is con- 
structed by plotting the points (200, 10), (300, 80), etc., as in elemen- 
tary algebra, and joining them by a broken line as in Chart 7. If a 
smooth curve be drawn through the points plotted we have a cumu-
lative curve. The graph of Cum. f(X)/N is precisely coincident with
that of Cum. f(X) if a proper scale is used for the ordinates.

The cumulative curve is useful for the process of interpolation, 
that is, for estimating values between those given in the table. 
Suppose, for example, we desire to know the income such that half 
of the 1,810 have incomes less than it and half have larger incomes. 
Such an income is called the median 2 income. 

1 W. I. King, Wealth and Income of the People of the United States, 1915, p. 224.

2 The median will be discussed in Chapter 3. 





Chart 7 



We have:

\[ N = 1{,}810, \qquad \frac{N}{2} = 905 \]


Hence our question is: What is the income when the cumulative 
number of incomes is 905? 

Mark the point 905 on the vertical scale. Draw through this 
point a horizontal line which meets the cumulative polygon at A. 
Draw through A a vertical line which meets the horizontal axis at B 
for which the income is about $450. 

We can check this by simple proportion. From Table 13, we
have:

Income less than        Cum. f(X)

$400                       640
Median = $400 + x          905
$500                     1,170

The total differences in the two columns are 100 and 530; the partial
differences are x and 265. Hence

\[ \frac{x}{100} = \frac{265}{530} \]

\[ x = \frac{265}{530} \times \$100 = \$50 \]

Median = $400 + x = $450 (Approx.)

In general the proportion is written:

\[ \frac{\text{Partial difference in 1st column}}{\text{Total difference in 1st column}} = \frac{\text{Partial difference in 2nd column}}{\text{Total difference in 2nd column}} \]
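The proportion above amounts to linear interpolation along the cumulative polygon. A minimal sketch follows, assuming the lowest class begins at 0 as in Table 13; the function name `value_at` and its parameters are our own.

```python
# Upper class limits (dollars) and Cum. f(X), copied from Table 13.
upper_limits = [200, 300, 400, 500, 600, 700, 800, 900,
                1000, 1100, 1200, 1300, 1400]
cumulative = [10, 80, 640, 1170, 1450, 1600, 1710, 1747,
              1769, 1785, 1797, 1805, 1810]


def value_at(cum_target, limits, cum, lowest_limit=0):
    """Income at which the cumulative polygon reaches cum_target."""
    prev_limit, prev_cum = lowest_limit, 0
    for limit, c in zip(limits, cum):
        if cum_target <= c:
            # partial difference / total difference in the Cum. f(X) column
            share = (cum_target - prev_cum) / (c - prev_cum)
            return prev_limit + share * (limit - prev_limit)
        prev_limit, prev_cum = limit, c
    raise ValueError("target exceeds the total frequency")


median = value_at(1810 / 2, upper_limits, cumulative)
print(median)
```

The same function may be used for any other cumulative number, since only the target passed to it changes.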


EXERCISE 

Estimate from Chart 7 the income such that it is exceeded by exactly
three-fourths of the 1,810 incomes. Check your estimate by algebraic
interpolation. (You should secure about $367 for your result.)

In a manner similar to the formation of the “less than” distribu-
tion we may form a “more than” distribution. While the “less
than” distribution proceeds from the least variates and refers to the
upper limits of the classes, the “more than” distribution proceeds
from the greatest variates to the least and refers to the lower limits
of the classes.
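A “more than” distribution for the data of Table 13 may be sketched as follows. The lower class limits are read from the first column of that table, and each count gives the number of incomes at or above the stated lower limit; the variable names are our own.

```python
from itertools import accumulate

# Lower class limits (dollars) and class frequencies f(X), from Table 13.
lower_limits = [0, 200, 300, 400, 500, 600, 700, 800,
                900, 1000, 1100, 1200, 1300]
frequencies = [10, 70, 560, 530, 280, 150, 110, 37, 22, 16, 12, 8, 5]

# Accumulate from the greatest variates to the least, then restore
# the original order so each count lines up with its lower limit.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for limit, count in zip(lower_limits, more_than):
    print(f"{limit:>5} or more: {count:>5}")
```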


19. TYPES OF FREQUENCY CURVES 

As the student proceeds with the graphical representation of 
frequency distributions he will be impressed with the fact that the 
graphs of data, even when collected from widely different fields, 
show certain common characteristics, and can therefore be described 
as belonging to certain general types; in fact, many of the frequency 
curves can be closely represented by equations. 

The problem of representing the several types of frequency dis- 
tributions by equations that will best fit the data belongs to the 
field of advanced statistics, and we desire merely to suggest it at this 
point. We do, however, wish to describe briefly the general types of 
frequency curves that are most common. 




By far the most common of all is the moderately asymmetrical or 
mound-shaped distribution. It occurs in data collected from many 
fields, such as education, psychology, sociology, economics, biology. 
The frequencies of this general type increase more or less regularly 
up to a maximum, and then decrease in the same way. We also 
note in this type a piling up of cases near the center, that is, a central 
tendency. On Chart 8 we illustrate this type by curves a_1 and a_2.


Chart 8 



The second type we may name the symmetrical distribution, in
which the frequencies decrease uniformly on either side of a line 
through the center. It is frequently approached in form by data 
derived from coin- and dice-throwing experiments; from errors of 
observation in physical measurements; from biological measure- 
ments; from educational measurements. The graph for this second 
type is illustrated by curve b on Chart 8. 

Many writers call this second type a normal or bell-shaped distribu-
tion. We much prefer to reserve the name normal for the special
symmetrical distribution whose equation is

\[ y = C e^{-h^{2} x^{2}} \]

and which we shall discuss rather fully in Chapter 12. There are
many equations for the curve of the general form we are describing
here (see Exercises 6, 7, 8, and 9 at the end of this chapter) but a
curve is normal only when its equation has the above form.

The third type is the J-shaped distribution in which the frequency
constantly increases or constantly decreases. The greatest frequency
is at one end of the distribution or the other. It is illustrated by
curve c on Chart 8. [See Exercise 4, page 43.]

The fourth type we shall mention is the U-shaped distribution
(see curve d, Chart 8). It is rarely met. The stock example, familiar
to statisticians, is given in Exercise 11 at the end of this chapter.

20. SUGGESTIONS FOR TABULAR AND GRAPHIC 
PRESENTATION 

The process of arranging data into columns and rows in an orderly 
manner is called tabulation. The essential characteristics of a good 
tabulation are clearness and compactness. While no hard and fast 
rules can be given to cover all cases of table construction, the follow- 
ing suggestions may be found helpful: 1 

1. The table should have a clear and concise title. 

2. The columns and the rows should be arranged in an order that will 
facilitate comparisons. 

3. The columns should have concise headings stating the units of 
measurement when necessary. 

4. The forms should be set off by double lines at the top and the 
bottom, the sides remaining open. 2 

5. The totals may be placed above or below the detail which they
summate.

6. If possible, the source of the data should be given. 

In the construction of the charts, we should note especially that: 

1. A boundary (picture frame) improves the appearance of the picture. 

2. A clear title and subtitle should be in evidence. 

1 A splendid treatment of tabular representation is found in Horace Secrist, 
An Introduction to Statistical Methods , rev. ed., 1925, Chap. VI. 

2 A narrow, compact table may have side lines. 





3. The scales should be so selected that the main facts are given due 
emphasis. 

4. The horizontal and vertical scales, with suitable captions, should 
be easily interpreted. 

EXERCISES 

1. The following table gives the distributions of heights and weights of 
1,515 first-year university men. What are the class boundaries of the 
several classes in each distribution? Construct the histograms and fre- 
quency curves for these distributions. To what general types do these 
distributions belong? 

Distribution of Heights and Weights of 1,515 Men

(a) Heights in Inches

Class Mark     Frequency
    X             f(X)

    58               2
    59               1
    60               7
    61              10
    62              26
    63              40
    64              74
    65             142
    66             220
    67             230
    68             258
    69             231
    70             118
    71              99
    72              38
    73              15
    74               2
    75               1
    76               0
    77               1

  Total          1,515

(b) Weights in Pounds

Class Mark     Frequency
    X             f(X)

   95.5              5
  105.5             34
  115.5            139
  125.5            300
  135.5            367
  145.5            319
  155.5            205
  165.5             76
  175.5             43
  185.5             16
  195.5              3
  205.5              4
  215.5              3
  225.5              1

  Total          1,515


*2. The following table gives the distribution of head-breadths of 1,000 
Cambridge men, the measurements being taken to the nearest tenth of an 
inch. Draw the histogram and the frequency polygon for these data. 
What is the general type of this distribution? Find the class boundaries. 



Distribution of Head-Breadths of 1,000 Men 1

Class Mark     Frequency
    X             f(X)

   5.5               3
   5.6              12
   5.7              43
   5.8              80
   5.9             131
   6.0             236
   6.1             185
   6.2             142
   6.3              99
   6.4              37
   6.5              15
   6.6              12
   6.7               3
   6.8               2

  Total          1,000


3. In the following table, the average price per bushel is that received 
by producers December 1. 


Average Yield and Average Price of Wheat, 1919-1928 2

          Average Yield     Average Price
            per Acre          per Bushel
Year        (bushels)          (cents)

1919          12.8              214.9
1920          13.6              143.7
1921          12.8               92.6
1922          13.9              100.7
1923          13.4               92.3
1924          16.5              129.9
1925          12.9              141.6
1926          14.8              119.8
1927          14.9              111.5
1928          15.6               97.2


Make a chart of these data representing both the average yield and the
average price on the same diagram. What can you say about the trends? 

1 The data are taken from Biometrika, Vol. I, p. 220.

2 The data are taken from Yearbook of Agriculture, 1928, p. 670.





4. In the following table the average price per barrel is that of Baldwins 
at Boston. 

Total Production of Apples in the United States and Average
Price, 1910-1926 1

          Production       Price                  Production       Price
         (millions of    per Barrel              (millions of    per Barrel
Year       bushels)      (dollars)      Year       bushels)      (dollars)

1910         142           3.68         1919         142           6.71
1911         214           2.56         1920         224           4.02
1912         235           2.28         1921          99           6.69
1913         145           3.95         1922         203           4.84
1914         253           2.08         1923         203           4.02
1915         230           2.36         1924         172           4.78
1916         194           3.44         1925         172           3.92
1917         167           4.40         1926         247           3.22
1918         170           5.94


Make a chart of these data representing both the production and the 
price on the same diagram. Point out an important relationship that is 
emphasized by the graph. 

5. Make a chart of the data of the following table. Describe the trend. 


Divorces in the United States 2

         Number of Divorces               Number of Divorces
Year        (thousands)          Year        (thousands)

1890           33.5              1916          112.0
1895           40.4              1922          148.8
1900           55.8              1926          180.0
1905           68.0




Plot the curve for each of the following equations, and describe the 
general type to which each curve belongs. 


6 . y = 


100 

2 * + 2 


7. y « 


100 

z 2 + 2 


8. y = 50(2)"? 


9 . 



10. y 


-4 + f)'H)‘ 


1 Loc. cit., p. 764.

2 The data are taken from Statistical Abstract of the United States, 1930, 
p. 92. 





11. Draw a histogram to represent the data of the following table.

Frequencies in Days of Estimated Intensities
of Cloudiness at Breslau, 1876-1885

Cloudiness     Frequency

     0             751
     1             179
     2             107
     3              69
     4              46
     5               9
     6              21
     7              71
     8             194
     9             117
    10           2,089

  Total          3,653


12. The following table gives the annual production of Portland Cement
in the United States (Statistical Abstract of the United States, 1930, p. 785).
Construct a broken-line diagram for these data. Is the general trend up-
ward or downward? Linear or curvilinear?

Portland Cement Production

          Production                  Production
         (millions of                (millions of
Year       barrels)         Year       barrels)

1910          77            1920         100
1911          79            1921          99
1912          82            1922         115
1913          92            1923         137
1914          88            1924         149
1915          86            1925         161
1916          92            1926         165
1917          93            1927         173
1918          71            1928         176
1919          81            1929         171


13. The following table gives the annual production of cigarettes in
the United States. Construct a broken-line diagram for these data. Is
the trend of production linear or curvilinear?

Cigarette Production

         Annual Production               Annual Production
Year        (billions)          Year        (billions)

1920           47.4             1925           82.2
1921           52.1             1926           92.1
1922           55.8             1927           99.8
1923           66.7             1928          108.7
1924           72.7             1929          122.3


CUMULATIVE REVIEW 

1. Name several fields of investigation that make use of the statistical
method.

2. Name the four steps in the solution of a statistical problem and
state briefly what each means.

3. What is meant by variation in statistical data?

4. Define continuous variates; discrete variates. Illustrate.

5. What is meant by “error in a measurement”? The relative error
in a measurement? Illustrate.

6. Usually what is the object of a statistical analysis?

7. Distinguish between sample and parent population.

8. Can you think of a problem in which the primary object is a sum-
marized numerical description of the sample only?

9. What letter do we use to designate the total frequency, \(\sum f(X)\)?

10. In the terminology of the text, what do the symbols X, f(X), and
N represent?

11. Give directions for constructing a histogram; a frequency polygon.

12. Prove: The total area of a histogram equals the total frequency, N,
times the class width, w. That is,

\[ \text{Area} = w \sum f(X) = wN. \]

13. What is an ogive? Mention several uses of the ogive.



Chapter 3 

MEASURES OF CENTRAL TENDENCY 
21. INTRODUCTION 

It will be recalled that after the collection of the data the next step 
in the solution of a statistical problem is the organization of the data. 
The preceding chapter has been devoted to the problem of organi- 
zation of the data and its graphical analysis. This brings us to the 
third step, the numerical analysis of the data. We shall find it neces- 
sary to devote several chapters to this important part of our problem. 

The primary purpose (see Section 1) of a statistical analysis is to 
abstract the relevant information from a mass of numerical data and to 
express the results clearly and concisely. We accomplish this purpose
by computing certain summarizing numbers, or averages, which are
simply statistical constants, rigidly defined, and which are designed,
as Professor Bowley says, “to enable the human mind to comprehend
with a single effort the significance of the whole.”

Averages may be used not only to give us a concise picture of a 
large group of numbers, but they may be used also to compare dif- 
ferent groups, to obtain important facts about a large universe (the 
parent population) from the measurements of a sample, to measure 
the relationship between different groups. 

The present chapter will be devoted to the averages which measure 
central tendency. 1 We shall give attention to five such measures: 
(1) the arithmetic mean, (2) the median, (3) the mode, (4) the 
geometric mean, and (5) the harmonic mean. 

While we shall not at this time undertake to judge the relative
merits of these measures, we may with propriety mention several
criteria by which an average may be fairly judged. Yule has men-
tioned several properties that an average should possess. 2 He says
that an average (1) should be rigidly defined, (2) should be based on
all the observations, (3) should be readily comprehensible, (4) should
be easily computed, (5) should be affected as little as possible by the
fluctuations due to sampling, 1 and, finally, (6) should lend itself readily
to algebraic treatment.

1 Averages that measure other characteristics will be discussed in succeeding
chapters.

2 Yule and Kendall, op. cit., p. 113.



22. THE ARITHMETIC MEAN, M_X


The arithmetic mean of a group of numbers, essentially measure-
ments, is their sum divided by their number. For example, the
arithmetic mean of the numbers 3, 5, 8, 13, 6 is given by:

\[ \text{A.M.} = \frac{3 + 5 + 8 + 13 + 6}{5} = 7 \]

In algebraic form, if \(X_1, X_2, X_3, \ldots, X_N\) is a set of N variates,
their arithmetic mean is given by:

\[ M_X = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum X}{N} \tag{1} \]

It may happen that many of the variates are equal. Suppose
the grades of 23 students on a certain test were: 96, 92, 92, 85, 85,
85, 85, 76, 76, 76, 76, 76, 76, 65, 65, 65, 65, 60, 60, 60, 50, 50, 40. Of
course we can find the arithmetic mean by the above formula and
definition, but the arithmetic is simplified if we proceed as follows:

\[ M_X = \frac{96(1) + 92(2) + 85(4) + 76(6) + 65(4) + 60(3) + 50(2) + 40(1)}{23} = \frac{1656}{23} = 72 \text{ centigrade units (c.u.)} \]


The numbers, 1, 2, 4, 6, 4, 3, 2, 1 are the frequencies of the grades. 
We can show this arithmetic mean by simply arranging the above in 
tabular form, thus giving the frequency distribution. 


Table 14. Frequencies of Grades of 23 Students

Grade     Frequency
  X         f(X)        Xf(X)

  96          1            96
  92          2           184
  85          4           340
  76          6           456
  65          4           260
  60          3           180
  50          2           100
  40          1            40

Total        23         1,656


1 See Section 37 for an explanation. 





\[ M_X = \frac{1656}{23} = 72 \text{ c.u.} \]


In general, suppose that \(X_1\) appears \(f(X_1)\) times, that \(X_2\) appears
\(f(X_2)\) times, and so on, and that \(X_n\) appears \(f(X_n)\) times, then evi-
dently:

\[ M_X = \frac{\sum_{i=1}^{n} X_i f(X_i)}{\sum_{i=1}^{n} f(X_i)} = \frac{\sum X f(X)}{N} \tag{2} \]

where \(N = \sum f(X)\) = the total frequency = the number of the
measures. The table headings for formula (2) should be

X        f(X)        Xf(X)

as is illustrated by Table 14.
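Formula (2) translates directly into a short computation over the three columns X, f(X), and Xf(X). The following sketch uses the data of Table 14; the function name `arithmetic_mean` is our own.

```python
# Grades X and their frequencies f(X), copied from Table 14.
grades = [96, 92, 85, 76, 65, 60, 50, 40]
frequencies = [1, 2, 4, 6, 4, 3, 2, 1]


def arithmetic_mean(values, freqs):
    """Formula (2): the sum of X f(X) divided by N, the sum of f(X)."""
    n = sum(freqs)                                   # N, total frequency
    total = sum(x * f for x, f in zip(values, freqs))  # sum of the Xf(X) column
    return total / n


products = [x * f for x, f in zip(grades, frequencies)]  # the Xf(X) column
mean = arithmetic_mean(grades, frequencies)
print(sum(products), sum(frequencies), mean)
```

Setting every frequency equal to 1 reduces the same function to formula (1).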

At this point a few words about our notation are in order. We
have indicated our measurements, scores, etc. by the upper case X.
Since the arithmetic mean is generally called the mean, we may
naturally represent the mean by M. The subscript X gives emphasis
to the fact that we are averaging N values of X. If the original
items are indicated by Y, or by Z, the corresponding means may be
represented by M_Y and M_Z respectively. We shall find it necessary
to use the subscript only when dealing with problems of theory or
when we wish to emphasize what variable we are averaging. Hence,
in general we shall indicate the mean by M without the subscript.

In the preceding chapter, when discussing the formation of fre- 
quency distributions, our attention was directed to two important 
assumptions that we must make regarding our data. We assume: 

1. That in any class the measures are uniformly distributed throughout 
the interval; 

2. That the frequency of the class may be concentrated at its mid-point. 

We shall see that in most cases the error due to grouping is relatively 
slight and that even this can frequently be adjusted by certain cor- 
rections. 1 

1 Sheppard’s Corrections, Section 43 of this volume. 





It is evident, as indicated above, that the use of formula (2) for 
computing M requires columns for X the class mark, for f(X) the
frequency, and for Xf(X). The column for the class intervals may 
or may not be included at the pleasure of the computer. 

As another illustrative example, we shall compute M for the 
distribution of grades in college algebra previously exhibited in 
Table 8. (Note the application of assumption 2 above.) 


Table 15. Grades in College Algebra: Computing M

  X        f(X)        Xf(X)

  95         4           380
  90         6           540
  85        12         1,020
  80        19         1,520
  75        37         2,775
  70        24         1,680
  65        11           715
  60         6           360
  55         4           220
  50         2           100

Total      125         9,310

\[ M = \frac{9310}{125} = 74.48 \text{ c.u.} \]


The sum of the original grades in Table 6 is 9,313, thus giving the 
arithmetic mean from the ungrouped data to be 74.504. In either 
case if the values are rounded to one decimal place we have: 

M = 74.5 c.u.

The extreme closeness of the two results is accounted for by our 
choosing as mid-points of the class intervals the values 60, 65, 70, 75, 
etc., at which the original data were heavily loaded. 
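The size of the grouping error in this example can be seen by setting the two means side by side. This sketch takes the ungrouped sum 9,313 from the text as given, since the original grades of Table 6 are not reproduced here; the variable names are our own.

```python
# Class marks X and frequencies f(X), copied from Table 15.
class_marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
frequencies = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]

n = sum(frequencies)                                      # N = 125
grouped_sum = sum(x * f for x, f in zip(class_marks, frequencies))
ungrouped_sum = 9313   # sum of the original grades, as stated in the text

grouped_mean = grouped_sum / n
ungrouped_mean = ungrouped_sum / n
grouping_error = ungrouped_mean - grouped_mean

print(grouped_mean, ungrouped_mean, round(grouping_error, 3))
```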


23. THE ARITHMETIC MEAN AS A MOMENT 

The term moment is one which the statisticians have borrowed
from the subject of mechanics, where “the moment of a force is its
tendency to produce rotation.” Thus if we have a weight of 10 pounds
suspended from a horizontal bar at a point 5 feet from the fulcrum O,
the first moment 1 of the force about O is 10 × 5 = 50 foot pounds,
which is the tendency of the force to produce clockwise rotation
about O. If we have two weights of 25 pounds and 10 pounds sus-
pended at distances of 4 feet and 5 feet respectively from and on the
same side of O, the total first moment of the two forces about O is
25 × 4 + 10 × 5 = 150 foot pounds, which is the tendency of the
two forces to produce clockwise rotation about O.

Diagram 2 (a weight of 10 lbs. suspended 5 ft. from O)

Diagram 3 (weights of 25 lbs. and 10 lbs. suspended 4 ft. and 5 ft. from O)

It is evident that a single force of 50 pounds suspended 3 feet from
O on the same side would produce the same turning effect. But
where could we suspend both the 10-pound and the 25-pound weights
(or a single 35-pound weight) in order that they would produce the
same first moment as the 25-pound and the 10-pound weights when
located as above?

Diagram 4 (a weight of 50 lbs. suspended 3 ft. from O)

Diagram 5 (a weight of 35 lbs. suspended x ft. from O)

1 If the weight be multiplied by the square of the distance, the product is called
the second moment of the force about O.





We evidently have the equation:

\[ 35x = 150 \]

from which

\[ x = 4\tfrac{2}{7} \text{ ft.} \]
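The arithmetic of the lever may be sketched as follows, using exact fractions so that the result appears as 30/7, that is, 4 2/7 ft. The function name `total_first_moment` is our own.

```python
from fractions import Fraction

# (weight in pounds, distance from O in feet) for the two suspended weights.
loads = [(25, 4), (10, 5)]


def total_first_moment(loads):
    """Sum of weight times distance, in foot pounds."""
    return sum(weight * distance for weight, distance in loads)


moment = total_first_moment(loads)               # 25*4 + 10*5
total_weight = sum(weight for weight, _ in loads)

# Distance at which the combined weight gives the same first moment.
x = Fraction(moment, total_weight)
print(moment, total_weight, x)
```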

Let us now consider the frequencies as weights or forces acting at
the distances from O determined by their class marks as the figure
indicates. The total clockwise turning effect (the first moment) of
all the frequencies is:

\[ X_1 f(X_1) + X_2 f(X_2) + X_3 f(X_3) + \cdots + X_n f(X_n) \]

That is, the total first moment of the several frequencies is \(\sum X f(X)\).

Diagram 6 (the frequencies f(X_1), f(X_2), f(X_3), ..., f(X_n) suspended at the distances X_1, X_2, X_3, ..., X_n from O)

Now where can the sum of the frequencies, \(\sum f(X)\) or N, be sus-
pended in order that it may produce the same turning effect? Evi-
dently at the point M, since by (2):

\[ M \sum f(X) = \sum X f(X) \]

Diagram 7 (the total frequency, \(\sum f(X)\), suspended at the distance M from O)

Hence M is that point in the X scale at which the total frequency
may be suspended so that the first moment of the total frequency about O
equals the total first moment about O of the several frequencies.

Let us look at this matter now from a statistical rather than from 
a statical point of view. The preceding discussion has really been 
concerned with statical moments. By Theorem II of Section 4 
(p. 9), we can write: 





\[ M = \frac{\sum X f(X)}{N} = \sum X \frac{f(X)}{N} = X_1 \frac{f(X_1)}{N} + X_2 \frac{f(X_2)}{N} + X_3 \frac{f(X_3)}{N} + \cdots + X_n \frac{f(X_n)}{N} \]

If we suspend the quantities \(f(X_i)/N\), i = 1, 2, 3, ..., n, at the
points designated by the class marks, it is evident that M is the
tendency of the several frequencies, when each is divided by N, to
produce rotation about O; that is, M is the first statistical moment
about O.


Diagram 8 (the quantities f(X_1)/N, f(X_2)/N, f(X_3)/N, ..., f(X_n)/N suspended at the distances X_1, X_2, X_3, ..., X_n from O)

The nth statistical moment of a frequency distribution about any
point A is defined as:

\[ \sum d_i^{\,n} \frac{f(X_i)}{N} = \frac{\sum d_i^{\,n} f(X_i)}{N} \]

where \(d_i\) is the distance from A to \(X_i\) and \(f(X_i)\) is the frequency cor-
responding to \(X_i\).
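The definition may be sketched in code as below. We assume that the distance d is taken with its sign, d = X − A, so that points on opposite sides of A contribute with opposite signs to the odd moments; the function name `statistical_moment` and the parameter names are our own. The data are those of Table 14.

```python
def statistical_moment(class_marks, freqs, about, order):
    """The moment of the given order about the point `about`.

    Assumes signed distances d = X - about; each term d**order is
    weighted by f(X) / N.
    """
    n = sum(freqs)
    return sum((x - about) ** order * f
               for x, f in zip(class_marks, freqs)) / n


# Grades and frequencies from Table 14.
grades = [96, 92, 85, 76, 65, 60, 50, 40]
frequencies = [1, 2, 4, 6, 4, 3, 2, 1]

# The first moment about O (the origin of the X scale) is the mean M.
first_about_origin = statistical_moment(grades, frequencies, about=0, order=1)
print(first_about_origin)
```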

Another simple but important moment property of the arithmetic 
mean is contained in the theorem: the first moment of a distribution 
about the arithmetic mean is zero. 

We shall indicate the deviation of any measure from the arithmetic
mean by x. That is, \(x_i = X_i - M\). Applying Theorems I and II
of Section 4, and formula (2) we have

\[ \text{First moment about } M = \sum x \frac{f(X)}{N} = \frac{1}{N} \sum (X - M) f(X) \]

\[ = \frac{1}{N} \left[ \sum X f(X) - M \sum f(X) \right] \]

\[ = \frac{1}{N} \left[ MN - MN \right] = 0 \]





Of course the corollary immediately follows that

\[ \sum x f(X) = 0 \]

The arithmetic mean is then the “center of balance” or the “center
of equilibrium” of the frequencies. That is, it is the point about
which the frequencies suspended as weights will balance or be in
equilibrium.

Let us examine the formula \(\sum x f(X) = 0\) for its algebraic meaning
to statistics. Each x is a deviation of a corresponding X from the
arithmetic mean: that is, \(x_i = X_i - M\). Since each x occurs f(X)
times, the quantity \(\sum x f(X)\) gives the algebraic sum of the deviations
from the arithmetic mean of measures grouped in a frequency distri-
bution. Thus we have the theorem: if N measures are arranged in
a frequency distribution, the algebraic sum of the deviations from M is
zero. [See Exercise 5 of the next list.]

To illustrate these important properties, consider the following 
distribution: 


Table 16

  X        f(X)       Xf(X)       x = X - 70      xf(X)

 82.5        1          82.5         12.5           12.5
 77.5        3         232.5          7.5           22.5
 72.5        8         580.0          2.5           20.0
 67.5       10         675.0        - 2.5         - 25.0
 62.5        4         250.0        - 7.5         - 30.0

Total       26        1820.0                         0.0

By formula (2) we find

\[ M = \frac{1820}{26} = 70 \]


The second part of the table follows immediately. 

We may see what this theorem means graphically by considering
the following diagram. We see that the counter-clockwise moment
(turning effect) about the point M = 70 is balanced by the clockwise
moment about this point, for the clockwise moment equals +55 and
the counter-clockwise moment equals -55. Thus, the point 70 is
the center of equilibrium.



Diagram 9 (the frequencies f(X) = 4, 10, 8, 3, 1 of Table 16 suspended at their class marks, in balance about the point 70)
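The balance about M in Table 16 may be verified numerically. In this sketch the positive products xf(X) are collected as the clockwise moment and the negative products as the counter-clockwise moment; the variable names are our own.

```python
# Class marks X and frequencies f(X), copied from Table 16.
class_marks = [82.5, 77.5, 72.5, 67.5, 62.5]
frequencies = [1, 3, 8, 10, 4]

n = sum(frequencies)
mean = sum(x * f for x, f in zip(class_marks, frequencies)) / n

# The xf(X) column: each deviation from the mean times its frequency.
deviation_products = [(x - mean) * f for x, f in zip(class_marks, frequencies)]

clockwise = sum(p for p in deviation_products if p > 0)
counter_clockwise = sum(p for p in deviation_products if p < 0)

print(mean, clockwise, counter_clockwise, sum(deviation_products))
```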


These moment considerations have led some authorities to call 
the arithmetic mean for grouped data, as defined by Formula (2), 
the weighted arithmetic mean. In contradistinction the arithmetic 
mean for ungrouped data, as defined by Formula (1), they call the 
unweighted arithmetic mean. 

Note. The following list of exercises is given primarily to prepare the 
student for a facile reading of Section 24. The several exercises should 
be solved in detail. 


EXERCISES

1. Draw a moment-diagram (see Diagram 8, page 65) for the data of the
following distribution. Compute the first statistical moment about O of
these data.

  X        f(X)

  2.5        2
  5.0        4
  7.5        8
 10.0        4
 12.5        2





2. Compute M for the distribution of lengths of beans described in Exer-
cise 1 on page 41. In what unit is M measured?

3. Compute M for the distribution described in Exercise 3 on page 42.
In what unit is M measured?

4. Complete the following:

X_1 = 1        X_1 - M = 1 - (4) = (-3) = x_1
X_2 = 2        X_2 - M = 2 - ( ) = ( )  = x_2
X_3 = 4        X_3 - M = 4 - ( ) = ( )  = x_3
X_4 = 5        X_4 - M = 5 - ( ) = ( )  = x_4
X_5 = 8        X_5 - M = 8 - ( ) = ( )  = x_5

ΣX = ( )       ΣX - 5M = 0 = Σx_i

M = (4)


5. In our notation x_i designates the deviation of X_i from M, i.e.,
x_i = X_i - M. Completing the following table we
arrive at the theorem: the algebraical sum of the
deviations of a group of numbers from their arithmetic
mean is zero.

X_1 - M = x_1
X_2 - M =
. . . . . . .
X_N - M =

ΣX - NM  = Σx
ΣX - N( ) = Σx
        0 = Σx

6. We may frequently save labor in statistical computations by referring
the numbers to some new origin. Consider the numbers X: 20, 25, 28,
30, 32. Referred to X = 25 as origin these numbers become U: -5, 0,
3, 5, 7.

(Diagram: the points 20, 25, 28, 30, 32 on the X scale, with M_X marked, set against the corresponding points -5, 0, 3, 5, 7 on the U scale, with M_U marked.)

\[ M_U = \frac{\sum U}{5} = \frac{-5 + 0 + 3 + 5 + 7}{5} = \frac{10}{5} = 2 \]

on the U scale which corresponds to 27 on the X scale. That is, M_X = 27.

7. The U and X in Number 6 are evidently connected by the relation
U = X - 25, or X = U + 25. Replace X by this value in formula (1),
page 60, and show that

$$M_X = \frac{\sum X}{5} = 25 + \frac{\sum U}{5}$$





8. Complete the table and find M_X from the formula derived in
Number 7.

   X        U = X - 25
   20       ( )
   25       ( )
   28       ( )
   30       ( )
   32       ( )
   Total    ( )


9. Find M_X of the numbers 315, 330, 345, 360, 375, 395, 400, by selecting
the new origin at X = 350. Proceed as follows:

First: Derive the formula for M_X using U = X - 350 and N = 7.

Second: Prepare the table and substitute in the derived formula.

10. Find M_X of the numbers 228, 232, 234, 236, 238, 240, 243, 247,
by selecting the new origin at X = 240.

11. Find M_X of the numbers 215, 230, 245, 260, 275, 295, 300, by selecting
the new origin at X = 250.

12. Find M_X of the numbers 523, 534, 536, 538, 540, 543, 547, by selecting
the new origin at X = 540.

13. Can you think of two simple ways to find M_X for the numbers 75,
150, 225, 375? Explain them.

Hint: Let U = X/75 or X = 75U, and proceed.

14. Find M_X of the numbers 128, 256, 384, 512, 640, 768. Note that
each number is divisible by 128.

15. Prove that if U = X - A or X = U + A, A being a constant,
then, using (1),

$$M_X = A + \frac{\sum U}{N} = A + M_U$$


16.

   Class          X     f(X)    x' = (X - 20)/5    x'f(X)
    2.5- 7.5      5       2
    7.5-12.5     10       6
   12.5-17.5     15      11
   17.5-22.5     20      16
   22.5-27.5     25      10
   27.5-32.5     30       6
   32.5-37.5     35       3
   Totals                54

(1) Complete columns 4 and 5, and find $\sum x'f(X)$.

(2) Note that $x' = \dfrac{X - 20}{5}$ or $X = 5x' + 20$. Replace X by
this value in formula (2), page 61, and show that

$$M_X = 20 + \frac{5\sum x'f(X)}{54}$$

(3) Substitute and find M_X.

17. Prove that if U = X/k, k being a constant, then, using (1),

$$M_X = \frac{k\sum U}{N} = kM_U$$

18. If X = U + h or U = X - h, h being a constant, show from
equation (2) that:

$$M_X = h + \frac{\sum U f(X)}{N}$$

This transformation is equivalent to moving the origin to the point (h, 0),
the unit of measure remaining the same.

19. Let h = 75, and use the conclusion in the preceding exercise to find
M_X for the data in Table 15. The tabular diagram should be as follows:

M_X for Table 15 when h = 75

   X       f(X)    U = X - 75    Uf(X)
   95       4         20           80
   90       6         15           90
   etc.    etc.      etc.         etc.
   Total

20. Let x' = X/w or wx' = X, w being a constant, and show, using (2),
that

$$M_X = \frac{w\sum x'f(X)}{N}$$

This transformation is equivalent to expressing the variates in class units,
the origin remaining the same.





21. Let w = 5, and use the conclusion in the preceding exercise to find
M_X for the data in Table 15. The tabular diagram should be as follows:

M_X for Table 15 when w = 5

   X       f(X)    x' = X/5    x'f(X)
   95       4        19           76
   90       6        18          108
   etc.    etc.     etc.         etc.
   Total

22. Using h = 54, and the results of Exercise 18 above, find M x for the 
distribution in Exercise 3, page 42. 


24. A SHORT METHOD FOR COMPUTING 
THE ARITHMETIC MEAN 

It frequently happens that the distribution under consideration has
large values for X, large values for f(X), or large values for both
X and f(X), and the consequent arithmetical work for computing
M_X and other statistical constants becomes very tedious. In such
cases it is convenient, sometimes necessary, to simplify the numbers
so that we can save much labor. Three possible steps may be taken.
We may:

1. Change the unit of measure as in Exercise 20, page 70. 

2. Express the variates as measures from some new origin (frequently 
called the provisional mean or the guessed mean ) as in Exercise 19, 
page 70. 

3. Combine 1 and 2 to change the unit of measure and express the 
variates as measures from some new origin as in Exercise 16, page 69. 

We shall derive the appropriate formula for the third possibility,
and show that the others are special cases of it. To do this let:¹

$$x' = \frac{X - h}{w} \quad \text{or} \quad X = wx' + h$$

where:
   h  = the distance in original units from O to the new origin O'
   w  = the class width
   x' = the deviation of X from h expressed in class units

¹ See Figure 1, page 73.





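Then applying Theorems I and II of Section 4 (p. 9), we have:

$$M = \frac{\sum X f(X)}{N} = \frac{\sum (wx' + h) f(X)}{N}
    = \frac{\sum wx'f(X)}{N} + \frac{\sum h f(X)}{N}
    = \frac{w\sum x'f(X)}{N} + \frac{h\sum f(X)}{N}$$

$$M = h + \frac{w\sum x'f(X)}{N} \qquad (3)$$

since $\sum f(X) = N$.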


The quantity $\dfrac{\sum x'f(X)}{N}$ is usually denoted in statistical
work by $b_x$ or by $\nu_1'$ (read: nu one prime), and is called the
first moment about the arbitrary origin (h, 0), expressed in class
units. Hence we have:

$$M = h + wb_x = h + w\nu_1' \qquad (3a)$$

where

$$b_x = \nu_1' = \frac{\sum x'f(X)}{N}$$


If h = 0, we get the results previously mentioned in Exercise 20 
on page 70, and if w = 1, we get the results equivalent to those men- 
tioned in Exercise 18 on page 70. We shall refer to the computation 
of M by (3) as “the short method of computing M” 

Figure 1 on page 73 gives the graphical representation of this 
development. We shall refer to this figure many times, hence it 
should be well mastered. 

We have here the frequency curve Y = f(X) referred to the axes
OX and OY. The point P whose coordinates are (X, f(X)) is any
point on the curve.

O' = (h, 0) is the arbitrary origin or guessed mean. It should be
chosen at a class mark near the center of the distribution.

w b_x = the distance from O' to M.

Evidently:

   X = h + wx'
   OM = M = h + w b_x

The distance MR, which is the deviation of any measure from the
mean, will be needed in the next chapter. It is represented by
small x.





Figure 1 



Let us apply formula (3) to compute M for the distribution of 
Table 15. 

Let h = 75 and w = 5. We then have:

$$X = 5x' + 75 \quad \text{or} \quad x' = \frac{X - 75}{5}$$

and Table 15 becomes:

Table 17. M for Table 15 when h = 75 and w = 5

   X       f(X)    x' = (X - 75)/5    x'f(X)
   95        4            4              16
   90        6            3              18
   85       12            2              24
   80       19            1              19
   75       37            0               0
   70       24          - 1            - 24
   65       11          - 2            - 22
   60        6          - 3            - 18
   55        4          - 4            - 16
   50        2          - 5            - 10
   Total   125                         - 13





$$b_x = \frac{\sum x'f(X)}{N} = \frac{-13}{125}$$

$$M = h + wb_x = 75 + 5\left(\frac{-13}{125}\right) = 74.48$$
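
The arithmetic of Table 17 may be checked mechanically. The following is only a sketch in Python, not part of the original text; the names `short_method_mean` and `direct_mean` are chosen here for illustration. It applies formula (3) with h = 75 and w = 5 and compares the result with the direct computation by formula (2).

```python
# Class marks X and frequencies f(X) as printed in Table 17.
marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
freqs = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]


def short_method_mean(marks, freqs, h, w):
    """Return (N, sum of x'f(X), M) where M = h + w * sum(x'f(X)) / N."""
    n = sum(freqs)
    # x' is the deviation of X from the provisional mean h, in class units.
    sum_xf = sum(((x - h) / w) * f for x, f in zip(marks, freqs))
    return n, sum_xf, h + w * sum_xf / n


def direct_mean(marks, freqs):
    """Formula (2): M = sum(X f(X)) / N, with no change of origin or unit."""
    return sum(x * f for x, f in zip(marks, freqs)) / sum(freqs)


n, sum_xf, m = short_method_mean(marks, freqs, h=75, w=5)
print(n, sum_xf, round(m, 2))
print(round(direct_mean(marks, freqs), 2))
```

Any other class mark may be tried for h; the value of M should not change, only the size of the numbers handled along the way.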


EXERCISES 


1. Using the short method, compute M for the distribution (a) of 
heights, Exercise 1, page 54. 

2 . Compute M of weights, Exercise 1 (b), page 54, by the short 
method. 

3 . Compute M for the distribution of head-breadths, Exercise 2, 
page 54, by the short method. 

4. a. Show that M for the first N integers, 1, 2, 3, . . . , N is (N + 1)/2.
b. Show that M for the first N odd integers, 1, 3, . . . , (2N - 1) is N.

5. The salaries of 100 male employees of the Smith-Jones Machine
Company were arranged into two groups of 40 and 60 men with mean
weekly salaries of $24.96 and $36.47 respectively. What was the mean
salary of the total group?

   1st group       2nd group       Total group
   N_1 = 40        N_2 = 60        N = 100
   M_1 = $24.96    M_2 = $36.47    M = ( )


6. Twenty-five employees of the Smith-Jones Machine Company 
earned $764.38 in a week, and fifteen other employees earned $638.92 
during the same period. What was the mean weekly salary of the forty 
employees? 

7. If in a series of N_1 observations, the arithmetic mean is M_1, and in a
second series of N_2 observations, the arithmetic mean is M_2, show that for
the entire group of N = N_1 + N_2 observations:

$$\text{Combined mean } M = \frac{N_1 M_1 + N_2 M_2}{N}$$


8. Generalize Exercise 7 above for n groups and show that:

$$\text{Combined mean } M = \frac{N_1 M_1 + N_2 M_2 + \cdots + N_n M_n}{N} = \frac{\sum N_i M_i}{N}$$

where $N = N_1 + N_2 + \cdots + N_n$.

9. Prove: $M_{aX} = aM_X$. Illustrate.

10. Prove: $M_{aX+b} = aM_X + b$. Illustrate.

11 . The sales record of a certain firm showed the following items: 
800 articles at 10 cents; 400 articles at 25 cents; 300 articles at 50 cents. 
What was the average price per article? 

12. The following data taken from Bulletin 435 of the U.S. Bureau of
Labor Statistics, “Wages and Hours of Labor in the Men’s Clothing
Industry, 1911-1926,” give the weekly earnings in 1926 of Hand Sewers
on Men’s Coats in St. Louis, Cincinnati, and Cleveland. Compute M
for each distribution.

                        Number of Employees
   Weekly earnings   Cincinnati   Cleveland   St. Louis
   $ 0 a.u. $ 2           1            1           0
     2 a.u.   4           1            2           2
     4 a.u.   6           2            1           2
     6 a.u.   8           2            1           4
     8 a.u.  10           6            3          11
    10 a.u.  12          14            4          13
    12 a.u.  14          27           10          28
    14 a.u.  16          15           12          28
    16 a.u.  18          19           14          29
    18 a.u.  20          15           22          21
    20 a.u.  22           9           28          13
    22 a.u.  24           7           33           6
    24 a.u.  26           7           26           2
    26 a.u.  28           4           14           2
    28 a.u.  30           2           18           4
    30 a.u.  32           2           13           1
    32 a.u.  34           3            3           1
    34 a.u.  36           1            4           1
    36 a.u.  38           3            0           2
    38 a.u.  40           0            0           1
   Total                140          209         171


13. The data of the adjacent table give the transverse strength of
bricks in pounds per square inch. They are taken from: American
Society of Testing Materials, Vol. 33, Part I, p. 458. (Measurements
made to nearest 10 pounds.)

Compute M.

   Strength
   (lbs. per sq. in.)    Number of Bricks
    230- 370                   1
    380- 520                   1
    530- 670                   6
    680- 820                  38
    830- 970                  80
    980-1120                  83
   1130-1270                  39
   1280-1420                  17
   1430-1570                   2
   1580-1720                   2
   1730-1870                   0
   1880-2020                   1
   Total                     270




25. THE MEDIAN, M_d

A second measure of central tendency, one that has a wide usage 
in statistical work, is the median. Roughly speaking, the median of a 
set of numbers is the middle one of the set when they are arranged in 
order of magnitude. Thus, if the set of numbers 33, 93, 45, 83, 72, 
97, 21, 67, 91, 46, 82 be arranged in the order of their size: 21, 33, 45, 
46, 67, 72, 82, 83, 91, 93, 97, the middle number, 72, is called the 
median number. Since there are eleven numbers in the above set, 
the sixth number is the median. In general, if there are N numbers
in a set arranged in the order of their size (i.e., “arrayed”), the
median number is the one whose position in the array is (N + 1)/2. If N is
even, obviously there is no middle number. In this case the median
is commonly taken to be one-half the sum of the two middle numbers. 
For example, the median of the set 6, 7, 9, 12, 16, 20 is usually taken 
to be (9 + 12)/2 = 10.5. 
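
The rule for an array may be put in a few lines of Python. This is only an illustrative sketch; the function name `median_of_array` is chosen here and does not occur in the text.

```python
def median_of_array(values):
    """Median of ungrouped numbers: the middle one when N is odd,
    one-half the sum of the two middle ones when N is even."""
    ordered = sorted(values)          # "array" the numbers by size
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Position (N + 1)/2 counting from 1 is index N // 2 counting from 0.
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of_array([33, 93, 45, 83, 72, 97, 21, 67, 91, 46, 82]))
print(median_of_array([6, 7, 9, 12, 16, 20]))
```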

If the measures are tabulated in a frequency distribution, we shall 
define the median as the point on the X-scale such that one-half the 
measures are below it and one-half are above it. On the histogram, 
frequency polygon, or frequency curve, it is that point on the X-axis 
at which, if an ordinate is erected, the area of the histogram, polygon, 
or curve will be bisected. The class interval in which the median is 
found is called the median class. 

In Section 18 (p. 48) we had a little work in computing the median
in simple situations. Let us now derive a formula for finding M_d
by looking at the matter from a slightly different point of view.

Let: N   = the total frequency
     w   = the class width
     b_2 = the lower class boundary of the median class
     B_2 = the upper class boundary of the median class
     n_2 = the total frequency of all classes less than b_2
     N_2 = the total frequency of all classes greater than B_2
     f_2 = the frequency of the median class
     z_2 = the distance from b_2 to the median
     M_d = the median

Since w is the width of each rectangle and the altitudes are f(X_1),
f(X_2), . . . , f(X_n), the area of the histogram is:

$$\text{area} = wf(X_1) + wf(X_2) + \cdots + wf(X_n)
  = w[f(X_1) + f(X_2) + \cdots + f(X_n)] = w\sum f(X) = wN$$

Figure 2

That is, area $wN$ represents N measures, and therefore area $wN/2$
represents $N/2$ measures, and area $wn_2$ represents $n_2$ measures.

From the figure we have:

$$ABKb_2 + b_2LRM_d = \frac{wN}{2}$$

or

$$wn_2 + f_2 z_2 = \frac{wN}{2}$$

from which we obtain

$$z_2 = \left(\frac{\frac{N}{2} - n_2}{f_2}\right) w$$

Hence the median is given by:

$$M_d = b_2 + z_2 = b_2 + \left(\frac{\frac{N}{2} - n_2}{f_2}\right) w \qquad (4)$$

The student should note especially that the value of the median
requires the class boundary, not the class mark, of the median class.
Once the median class is determined we know immediately N/2, n_2,
b_2, and f_2. Then computing M_d is decidedly simple.

Our first task, then, in computing M_d is to determine the median
class. To do this we find N/2, begin at the lower end of the scale
and add the frequencies in the successive classes until the lower
limit, b_2, of the class containing the median is reached. We then
have the median class and, incidentally, n_2.

Next we find N/2 - n_2, and observe the frequency of the median
class, f_2. We now have all the elements required by formula (4);
hence, substituting the values, we find M_d.

Consider the data in Table 18 as an illustrative example. 


Table 18. Computing M_d for Semester Grades of 125 Students in
College Algebra

   Class         f(x)¹
   92.5-97.5       4
   87.5-92.5       6
   82.5-87.5      12
   77.5-82.5      19
   72.5-77.5      37 = f_2
   67.5-72.5      24  \
   62.5-67.5      11   |
   57.5-62.5       6   |  n_2 = 47
   52.5-57.5       4   |
   47.5-52.5       2  /
   Total         125 = N

We have:

   w   = 5
   N   = 125
   f_2 = 37
   b_2 = 72.5
   n_2 = 47

Hence by (4):

$$M_d = 72.5 + \left(\frac{\frac{125}{2} - 47}{37}\right) 5 = 74.595 = 74.6 \text{ (approx.)}$$
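
Formula (4) lends itself to mechanical computation. The sketch below, in Python, is an illustration only; the function name `grouped_median` and the convention of listing the classes from the lowest upward are assumptions made here. It is applied to the frequencies of Table 18.

```python
def grouped_median(lower_boundaries, freqs, w):
    """Median by formula (4): M_d = b_2 + ((N/2 - n_2) / f_2) * w.

    Classes must be listed from the lowest to the highest; all classes
    are assumed to have the same width w.
    """
    n = sum(freqs)
    below = 0                      # running total, becomes n_2
    for b, f in zip(lower_boundaries, freqs):
        if f > 0 and below + f >= n / 2:
            # This is the median class: b is b_2 and f is f_2.
            return b + (n / 2 - below) / f * w
        below += f
    raise ValueError("the distribution has no frequencies")


# Table 18, read from the bottom (47.5-52.5) to the top (92.5-97.5).
boundaries = [47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5, 82.5, 87.5, 92.5]
frequencies = [2, 4, 6, 11, 24, 37, 19, 12, 6, 4]

md = grouped_median(boundaries, frequencies, w=5)
print(round(md, 3))
```

The loop does what the text describes: it adds frequencies from the lower end of the scale until the class containing the N/2-th measure is reached.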


Employing the assumption made in Section 12 to the effect that 
the items of a class are uniformly or evenly distributed over the 
interval, we can find the median by simple interpolation and thus 
be freed from the tedium of remembering a formula. 

Consider the data of Table 8 on page 26. We count from the 
lower values and determine 72.5-77.5 to be the median class. Be- 
low this class are found 2 + 4 + 6 + 11 + 24, or 47, scores. We 
need to move up the scale above 72.5 a distance z 2 until we obtain 
15.5 scores from the 37 scores of the median class, and thus have 
47 + 15.5 = 62.5, or N / 2, scores. By simple proportion we set up 
the equation for determining z 2 and thus find M d . The following 
diagram may assist in understanding the solution. 

¹ From this point forward we shall designate the class frequency
corresponding to X, x', or x by f(x).





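Diagram 10

[Diagram: the median class drawn from 72.5 to 77.5, a distance of 5
containing 37 scores; the median M_d lies a distance z_2 above 72.5
and cuts off 15.5 of those scores.]

$$\frac{z_2}{5} = \frac{15.5}{37}, \qquad z_2 = \frac{5(15.5)}{37} = 2.095$$

$$M_d = 72.5 + z_2 = 74.595 \text{ c.u.}$$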


EXERCISES 


Compute the medians of the following distributions: 

1. The distribution of Exercise 1, page 41. 

2 . The distribution of Table 13, page 49. 

3. The distributions (a) and (b) of Exercise 1, page 54. 

4 . The distributions of Exercise 12, page 74. 

5. The distribution of Exercise 13, page 75.

6. Refer to Figure 2, and by equating to wN/2 the area VSTRM_dV,
show that the median is given by:

$$M_d = B_2 - \left(\frac{\frac{N}{2} - N_2}{f_2}\right) w$$

7. According to the definition, what may be the median of the adjacent
distribution? At what point would you take the median?

   Class          f(x)
   30 a.u. 33       4
   27 a.u. 30       8
   24 a.u. 27      16
   21 a.u. 24       0
   18 a.u. 21      12
   15 a.u. 18      12
   12 a.u. 15       4
   Total           56


8. Compute the median for the distribution of Exercise 3, page 42. 
Since this is a distribution of discrete data, what interpretation can you 
give to this median? 





26. THE MODE, M_o

A mere glance at Table 8 (p. 26) informs us that the class inter- 
val 72.5-77.5 has the greatest frequency. It is called the modal class. 
The class mark, 75, of the modal class is called the crude mode. 

The mode may be roughly defined as the measure that occurs 
most frequently. The modal height of twelve-month-old white boys 
is about 29 inches, for there are more twelve-month-old white boys 
29 inches high than for any other height. Any haberdasher will 
tell you that there are more calls for shirts of size 15 than for any 
other size; hence the modal size for shirts is 15. The mode is the 
typical measure, the fashionable measure, la mode. It is probably 
what the layman understands as the “ average. ” 

The true mode is easy to define but very difficult to determine. 
The true mode is the value of X at which the ideal frequency curve 
which best fits a set of data has a maximum. Of course the subject 
of fitting frequency curves in general is beyond the scope of this 
text, but we may state that the ideal curve for a given distribution 
is difficult to find. 

The mode is roughly approximated by the mid-point of the class 
with the greatest frequency. We appropriately call this value the 
crude mode. We obtain a closer approximation to the true mode by 
making a correction upon the crude mode. This correction is made 
by a process of interpolation. Such interpolation is usually based 
upon the values that determine the modal class and its two adjacent 
classes which we choose to call the three “ central” classes. 

While it is true that for most mound-shaped distributions the 
mode is in the central part of the distribution, it is not unusual to 
encounter a mode near one of the extremes of the distribution. When 
this does occur the mode is certainly an important measure of central 
tendency. 

Of the several methods we shall use to determine an approximate 
mode, probably the method-of-the-parabola is the best. Although 
the mode, like the median, does not behave beautifully in the algebra 
of statistics and does not integrate conveniently in the description 
of the more complex features of statistical phenomena, it deserves 
a careful consideration. Let us now proceed to the problem of this 
section, how to find an approximate mode. 





Consider the grades in College Algebra, Table 8 (p. 26). The modal 
class and the two adjacent classes are 


   X      f(x)
   80      19
   75      37
   70      24


There is a well-defined modal class, namely, that with the class 
mark of 75. Further, since there are 24 members in the 70 class and 
19 members in the 80 class, certainly the mode should be drawn from 
75 toward 70 because of the added weight of the 70 class. Evidently 
the mode is located in the 72.5-77.5 interval, the point to be de- 
termined by the weights of the adjacent classes. Consider the fol- 
lowing diagram. 


Diagram 11

[Diagram: a scale marked 72.5, M_o, 75, 77.5.]

Let the frequencies 24 and 19 be considered as weights suspended
at the ends of the modal class interval. Let z be the amount that must
be added to the lower boundary 72.5 to give the approximate mode.
In order that the weights shall balance at M_o, we must have:

$$24z = 19(5 - z)$$

from which we obtain

$$z = 2.2$$

and

$$M_o = 72.5 + z = 74.7$$

This illustrates the well-known method given by Professor W. I. 
King in his Elements of the Statistical Method , page 124. In general, 
let: 

   f_{-1} = the frequency of the class next lower than the modal class
   f_1    = the frequency of the class next higher than the modal class
   b      = the lower boundary of the modal class
   B      = the upper boundary of the modal class
   w      = the class width
   z      = the amount which must be added to b to give M_o

If the frequencies are suspended as weights at the ends of the
modal class interval, in order for the weights to balance at M_o, we
must have:

Diagram 12

[Diagram: a scale marked b, M_o, B.]

$$f_{-1}\,z = f_1 (w - z)$$

$$z = \left(\frac{f_1}{f_{-1} + f_1}\right) w$$

and

$$M_o = b + z = b + \left(\frac{f_1}{f_{-1} + f_1}\right) w \qquad (5)$$
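
As a check on formula (5), the balance computation may be sketched in Python. The function name `mode_by_balance` and its argument names are illustrative assumptions, not notation from the text; the data are the three central classes of the college algebra grades.

```python
def mode_by_balance(b, w, f_lower, f_higher):
    """Approximate mode by formula (5): M_o = b + (f_1 / (f_-1 + f_1)) * w.

    b        -- lower boundary of the modal class
    w        -- class width
    f_lower  -- frequency of the class next lower than the modal class (f_-1)
    f_higher -- frequency of the class next higher than the modal class (f_1)
    """
    z = f_higher / (f_lower + f_higher) * w
    return b + z


mo = mode_by_balance(b=72.5, w=5, f_lower=24, f_higher=19)
print(round(mo, 1))
```

Since the lower neighbor is the heavier of the two (24 against 19), the result falls below the crude mode 75, as the text requires.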

It may be argued that, to be consistent with Section 22 (p. 60), the 
frequencies should be suspended at the mid-points of the respective 
class intervals. We shall give some exercises at the end of this section 
that will involve that very point. 

A second, and possibly a closer, approximation to the mode can 
be found by passing a quadratic parabola through the three central 
points and finding the value of X for which Y or f(x) has a maximum. 

The student may recall from elementary or college algebra that

$$Y = aX^2 + bX + c$$

represents a parabola; that it has a maximum if a is negative, a
minimum if a is positive, as in Figures 3 and 4. It can be shown
in several ways that the coordinates of the bend points, m and
M, are:

$$\left(-\frac{b}{2a},\ \frac{4ac - b^2}{4a}\right)$$

[Figures 3 and 4: parabolas showing the bend points m and M.]

That is, if a is negative, the value of X for which $aX^2 + bX + c$ is a
maximum is:

$$X = -\frac{b}{2a}$$

For example, $aX^2 + bX + c$ can be put into the form:

$$a\left(X + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a}$$

If a is negative, the largest value is obtained when $X + \frac{b}{2a} = 0$;
that is, when $X = -\frac{b}{2a}$. Similarly, if a is positive, the smallest
value is obtained when $X = -\frac{b}{2a}$.

Let us apply this method to the distribution of grades in college
algebra, the three “central” classes for which were given on page
81. When they are plotted they appear as in Figure 5.

Figure 5

[Figure: the points (70, 24), (75, 37), and (80, 19) plotted, with a
parabola passing through them.]

We have the three points on the curve as shown. The equation 
of the curve is: 


$$Y = AX^2 + BX + C$$

Substituting the coordinates, we have:

$$24 = A(70)^2 + B(70) + C$$
$$37 = A(75)^2 + B(75) + C$$
$$19 = A(80)^2 + B(80) + C$$

Solving for A, B, and C, we obtain

$$A = -0.62, \quad B = 92.5, \quad C = -3413$$

and the equation of the curve passing through the three given points
is:

$$Y = -0.62X^2 + 92.5X - 3413$$

The value of X for which Y is greatest is

$$X = -\frac{B}{2A} = \frac{92.5}{1.24} = 74.597 \text{ c.u.}$$

and this is an approximate mode.
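
The solution of the three simultaneous equations may be verified numerically. The following Python sketch is an illustration only: it obtains the coefficients by divided differences, which is a technique substituted here for convenience and is not the method of solution used in the text, and the function names are hypothetical.

```python
def parabola_through(p0, p1, p2):
    """Coefficients (a, b, c) of Y = aX^2 + bX + c through three points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d01 = (y1 - y0) / (x1 - x0)        # first divided differences
    d12 = (y2 - y1) / (x2 - x1)
    a = (d12 - d01) / (x2 - x0)        # second divided difference
    b = d01 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c


def vertex_x(a, b):
    """Abscissa of the bend point, X = -b / (2a)."""
    return -b / (2 * a)


# The three "central" points of the college algebra grades.
A, B, C = parabola_through((70, 24), (75, 37), (80, 19))
mode_direct = vertex_x(A, B)

# The same points in class units about the crude mode: x' = (X - 75) / 5.
a, b, c = parabola_through((-1, 24), (0, 37), (1, 19))
mode_class_units = 5 * vertex_x(a, b) + 75

print(round(A, 2), round(B, 1), round(C))
print(round(mode_direct, 3), round(mode_class_units, 3))
```

Both routes lead to the same abscissa, which illustrates why moving the origin to the crude mode and measuring in class units costs nothing in accuracy.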

The algebra can be greatly simplified 1 by using the class width 
( w = 5) as a unit and by moving the origin to the point (75, 0) 
where h = 75, the crude mode. We then have: 

$$X = 5x' + 75 \quad \text{or} \quad x' = \frac{X - 75}{5}$$

and the equation of the curve is now of the form:

$$Y = ax'^2 + bx' + c$$

Figure 6 exhibits the (x', Y) coordinates of the three “central”
points.

Figure 6

[Figure: the points (-1, 24), (0, 37), and (1, 19) in (x', Y)
coordinates.]

¹ See Exercises 40 and 41, page 110, for solutions based upon determinants.

Substituting the coordinates, we have:

$$24 = a(-1)^2 + b(-1) + c$$
$$37 = a(0)^2 + b(0) + c$$
$$19 = a(1)^2 + b(1) + c$$

Solving for a, b, and c, we have:

$$a = -15.5, \quad b = -2.5, \quad c = 37$$

The value of x' at the mode is

$$x' = -\frac{b}{2a} = -\frac{2.5}{31}$$

and the value of X at the mode is:

$$X = 5x' + 75 = -\frac{12.5}{31} + 75 = 74.597 \text{ as before.}^1$$

For mound-shaped distributions that are moderately asymmetrical
and also possess a moderate peakedness near the center, as in Figure 7,
the formula, due to Karl Pearson,

$$M_X - M_o = 3(M_X - M_d)$$

has been found to be approximately true. Since the median
and the arithmetic mean are not difficult to compute, this formula
may be used to advantage in finding M_o for certain types of
distributions. Owing to the fact that the distribution of college
algebra grades is very peaked, this formula cannot be expected to
check very satisfactorily.

Figure 7
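
Pearson's relation, solved for the mode, gives M_o = 3M_d - 2M_X. The short Python sketch below is an illustration only (the function name `pearson_mode` is chosen here); it applies the relation to the mean 74.48 and the median 74.595 found earlier for the college algebra grades.

```python
def pearson_mode(mean, median):
    """Mode from M - M_o = 3(M - M_d), that is, M_o = 3*M_d - 2*M."""
    return 3 * median - 2 * mean


mo_pearson = pearson_mode(mean=74.48, median=74.595)
print(round(mo_pearson, 3))
```

The result, about 74.8, differs somewhat from the 74.7 and 74.6 obtained by the other methods, in keeping with the caution just given about peaked distributions.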


¹ The method of determining an approximate mode by passing a quadratic
parabola through three points gives the same result as the method of
finite differences given by Czuber, Die statistischen Forschungsmethoden,
p. 71, which is mentioned by Professor Rietz in the Handbook of
Mathematical Statistics, p. 27.

EXERCISES

Find the approximate modes by three different methods for each of the
following distributions:

1. Of Exercise 1, page 41.

2. Of Exercise 3, page 42.

3. Of Exercise 1(a), page 54.

4. Of Exercise 2, page 54.

5. Assume that the class frequencies f_{-1} and f_1 of the classes adjacent to
the modal class are suspended as weights at their class marks as in the figure,
and show that, if the weights balance at M_o:

$$f_{-1}\left(\frac{w}{2} + z\right) = f_1\left(\frac{3w}{2} - z\right)$$

and

$$z = \left[\frac{3f_1 - f_{-1}}{2(f_{-1} + f_1)}\right] w$$

Diagram 13

[Diagram: a scale marked b, M_o, B, with the weight f_{-1} hung a
distance w/2 below b and the weight f_1 hung a distance w/2 above B,
so that f_1 is at a distance (3/2)w - z from M_o.]

6. Assume that f_0, the frequency of the modal class, and f_{-1} and f_1, the
frequencies of the classes adjacent to the modal class, are suspended as
weights at their respective class marks, as in the figure, and show that if
the weights balance at M_o:

$$f_{-1}\left(\frac{w}{2} + z\right) = f_0\left(\frac{w}{2} - z\right) + f_1\left(\frac{3w}{2} - z\right)$$

and

$$z = \left[\frac{f_0 + 3f_1 - f_{-1}}{2(f_{-1} + f_0 + f_1)}\right] w$$

$$M_o = b + z = b + \left[\frac{f_0 + 3f_1 - f_{-1}}{2(f_{-1} + f_0 + f_1)}\right] w$$

Diagram 14

7. Show that the value of M_o in Exercise 6 above is the arithmetic
mean of the modal group and the two groups adjacent to it; that is, show
that:

$$M_o = \frac{X_{-1}f_{-1} + X_0 f_0 + X_1 f_1}{f_{-1} + f_0 + f_1}$$

Hint: $X_{-1} = b - \dfrac{w}{2}$, $X_0 = b + \dfrac{w}{2}$, and $X_1 = b + \dfrac{3w}{2}$.

& £d it 





27. THE GEOMETRIC MEAN, M_g, AND KINDRED TOPICS

In the preceding pages of this chapter considerable attention has
been devoted to the three measures of central tendency that are most
widely used: the arithmetic mean, the median, and the mode.

Not all data are most logically averaged by these measures. From
several points of view the best average for the numbers 2, 4, and 8
is not 4 2/3, their arithmetic mean, but $4 = \sqrt[3]{2 \cdot 4 \cdot 8}$,
their geometric mean. The most logical average for the set of numbers
2, 4, 8, 16, 32, 64, 128 is their geometric mean, the seventh root of
their product, which is 16. It is a member of the group and, when the
data are plotted, is in the curve or trend of the data. The geometric
mean is represented by the point B, and the arithmetic mean by the
point C (4, 36 2/7).

   X      2^X
   1        2
   2        4
   3        8
   4       16
   5       32
   6       64
   7      128


We have learned in college algebra that series in which the quantities
increase or decrease at each interval by a constant percentage of
the value at the beginning of the interval are in geometric progression.
In different words, if the ratio of any number to the preceding number 
is constant, the numbers are in geometric progression. 

It is to such classes of numbers that the geometric mean most 
logically applies as an average. In observed data it is not expected 
that the ratio of any number to the preceding number will be abso- 
lutely constant; however, if the ratio is approximately constant , in 
such data the geometric mean is preferable. 

The geometric mean, then, is widely used in averaging rates of 
increase or decrease, such as in the study of the growth of any 
statistical population, growth of skill in an individual, relative 
changes in the prices of commodities — in short, any data that 
approximately satisfy the previously stated criterion. 

Consider the following table : 

Table 20. Population of the Continental United States¹

             Population       Ratio of Each
   Year          X             Item to the
             (millions)         One above
   1910         92.0
   1920        105.7              1.15
   1930        122.8              1.16


Since in this particular period of twenty years the ratios are 1.15 
and 1.16, essentially constant, we assume that the populations are 
in geometric progression, and their average would be their geometric 
mean, namely: 

$$M_g = \sqrt[3]{(92.0)(105.7)(122.8)}$$

To evaluate this we shall use logarithms, and write:

$$\log M_g = \tfrac{1}{3}[\log 92.0 + \log 105.7 + \log 122.8]
  = \tfrac{1}{3}[1.9638 + 2.0241 + 2.0892]
  = \tfrac{1}{3}[6.0771] = 2.0257$$

and $M_g = 106.1$ millions
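
The logarithmic computation above may be imitated in Python, where the standard library supplies common logarithms in place of the four-place tables. The sketch is illustrative; the function name `geometric_mean` is chosen here.

```python
import math


def geometric_mean(values):
    """Nth root of the product, found as the antilogarithm of the
    arithmetic mean of the common logarithms."""
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log


populations = [92.0, 105.7, 122.8]   # millions, from Table 20
mg = geometric_mean(populations)
print(round(mg, 1))
print(round(geometric_mean([2, 4, 8, 16, 32, 64, 128]), 6))
```

Working with logarithms rather than the raw product also avoids very large intermediate numbers when N is large.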

1 The data are taken from the Fifteenth Census of the United States , Vol. I, 
Population, p. 6. 





Further, if we assume that the decade rate of growth will continue 
for the next decade, then the population for example in 1940 will be 
1.16(122.8) = 142.4 millions. 

We have noted that the decade rate of growth from 1910 to 1920 
is 1.15 or 115 per cent. Suppose we are interested in the annual 
rate of growth , which we assume is constant during the decade, then 
we may interpolate the population for the years 1911, 1912, etc. 

Let:

   r      = the annual rate of increase
   P_0    = the population in 1910
   P_1    = the population in 1911 = P_0 + P_0 r = P_0(1 + r)
   P_2    = the population in 1912 = P_1 + P_1 r = P_0(1 + r)^2
   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
   P_10   = the population in 1920 = P_0(1 + r)^10

Therefore:

$$92(1 + r)^{10} = 105.7$$

To solve this equation, we may use logarithms. Hence:

$$10 \log (1 + r) = \log 105.7 - \log 92 = 2.0241 - 1.9638 = 0.0603$$
$$\log (1 + r) = 0.00603$$
$$(1 + r) = 1.014$$
$$r = 0.014 = 1.4 \text{ per cent}$$

Hence:

   P_1 = P_0(1 + r) = 92(1.014) = 93.3 millions in 1911
   P_2 = P_1(1 + r) = 93.3(1.014) = 94.6 millions in 1912

If we assume the same annual rate to continue from 1920 to 1921,
then the population in 1921 is given by:

   P_11 = P_10(1 + r) = 105.7(1.014) = 107.2 millions in 1921
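
The annual rate may also be obtained without tables by taking the tenth root directly. The following Python sketch is an illustration; the function name `constant_rate` is an assumption made here. Because it carries r to more places than the 1.014 used in the text, the interpolated populations agree with those above only after rounding.

```python
def constant_rate(p_start, p_end, periods):
    """Rate r per period such that p_start * (1 + r) ** periods == p_end."""
    return (p_end / p_start) ** (1 / periods) - 1


r = constant_rate(92.0, 105.7, 10)    # 1910 to 1920
p_1911 = 92.0 * (1 + r)
p_1912 = 92.0 * (1 + r) ** 2
p_1921 = 105.7 * (1 + r)

print(round(r, 3))
print(round(p_1911, 1), round(p_1912, 1), round(p_1921, 1))
```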

We are now ready to define the geometric mean of N measures to be
the Nth root of their product. If X_1, X_2, . . . , X_N are the measures, then:

$$M_g = \sqrt[N]{X_1 \cdot X_2 \cdots X_N} \qquad (6)$$

It is convenient to express this equation in logarithmic form, thus:

$$\log M_g = \frac{\log X_1 + \log X_2 + \cdots + \log X_N}{N} = \frac{\sum \log X}{N}$$

In other words, the logarithm of the geometric mean is equal to
the arithmetic mean of the logarithms of the original measures.

If the data are arranged in the form of a frequency distribution,
that is, if X_1 appears f(x_1) times, X_2, f(x_2) times, and so on, the
formula becomes:

$$M_g = \sqrt[N]{X_1^{f(x_1)} \cdot X_2^{f(x_2)} \cdots X_n^{f(x_n)}} \qquad (7)$$

where $N = f(x_1) + f(x_2) + \cdots + f(x_n)$


Suggested Exercise: Using the frequencies in formula (7) as weights,
show that the logarithm of a weighted geometric mean is the weighted
arithmetic mean of the logarithms of the measures; that is:

$$\log M_g = \frac{\sum f(x_i) \log X_i}{N}$$


Example 1. The following table gives for the years indicated the
number of divorces in the United States per 1,000 marriages. Find M_g.¹

             No. of divorces
   Year    per 1,000 marriages     Log X
                    X
   1906            84              1.9243
   1916           108              2.0334
   1922           131              2.1173
   1926           150              2.1761
   1931           170              2.2304

                                  10.4815 = Σ log X
                                   2.0963 = (Σ log X)/5 = log M_g
                                   124.9  = M_g


Example 2. Find the annual rate of increase of the divorce rate for 
the period 1906-1916. (See Example 1 above.) 

1 Four-place tables of logarithms and anti-logarithms are found in the 
Appendix. 



Solution. Let r_1 be the rate of increase. Following the line of reasoning
that was used on page 89, we find

$$84(1 + r_1)^{10} = 108$$

Taking logarithms,

$$\log (1 + r_1) = \frac{\log 108 - \log 84}{10} = \frac{2.0334 - 1.9243}{10} = 0.0109$$

$$1 + r_1 = 1.026$$

$$r_1 = 0.026 = 2.6\%$$

Exercise. Find the rates of increase of the divorce rate for the periods 
1916-1922, 1922-1926, 1926-1931. 


Problem 1. Prepare a skeleton table with the proper headings for finding 
the geometric mean of a frequency distribution. See formula (7). 

Problem 2. A population P 0 increases at a constant rate r per period 
for n periods. Show that the population P n at the end of n periods is 
given by 

P n = Po(l + r) n 


Problem 3. A population P 0 decreases at a constant rate r per period 
for n periods. Show that the population, P n , at the end of n periods is 
given by 

P n = Po(l - r) n 


Problem 4. If $M_{g,X}$ is the geometric mean of N X's, and $M_{g,Y}$ is the
geometric mean of N Y's, then the geometric mean $M_g$ of the 2N values
is given by

$$M_g^2 = M_{g,X} \cdot M_{g,Y}$$


Problem 6 . Plot the curve: F = 100(1 .+ X)\ 


Problem 6. Plot the curve: Y = 100(1 — X) A . 
Problem 7. Prove: $\dfrac{X_1 + X_2}{2} \geq \sqrt{X_1 X_2}$.


EXERCISES 

1. Complete column 3 for the following data, and note that the ratio is
approximately constant. Find the geometric mean for the number of
registrations.

Registration of Motor Vehicles in the United States, 1920-1929¹

              Number         Ratio of Each
   Year         X            Item to One above     Log X
            (thousands)
   1920        9,232
   1921       10,463
   1922       12,238
   1923       15,092
   1924       17,594
   1925       19,937
   1926       22,001
   1927       23,133
   1928       24,493
   1929       26,501


2. The United States gross imports of crude rubber increased from 
252,922 long tons in 1920 to 563,812 long tons in 1929. Find the annual 
rate of increase during this period, assuming that the rate of growth was 
constant. 

3. The value of a machine decreases at a constant rate from the cost
price of $1,000 to the scrap value of $100 in ten years. Find the annual rate
of decrease, and the value of the machine at the end of one, two, three, ...
years.

4. The number of divorces per 1,000 marriages increased from 62 per 
1,000 in 1890 to 166 per 1,000 in 1928. Assuming the annual rate of in- 
crease was constant, find its value. (See Statistical Abstract of the United 
States , 1930, p. 91.) 


¹ The data are taken from Statistical Abstract of the United States, 1930,
p. 385.


28. THE HARMONIC MEAN, $M_h$

Another measure of central tendency that is of value in solving
certain special types of problems is the harmonic mean. Owing to
its unfamiliarity and to the difficulties in interpreting it, it is probably
less used than any of the other measures of central tendency, yet for
certain problems its use is very desirable.

The harmonic mean is defined as the reciprocal of the arithmetic
mean of the reciprocals of the given numbers. If the given numbers
be $X_1, X_2, \ldots, X_N$, then:

$$M_h = \frac{N}{\frac{1}{X_1} + \frac{1}{X_2} + \cdots + \frac{1}{X_N}} = \frac{N}{\sum \frac{1}{X}} \tag{8}$$

The harmonic mean of 2, 4, 6 is:

$$\frac{3}{\frac{1}{2} + \frac{1}{4} + \frac{1}{6}} = \frac{36}{11}$$

The harmonic mean is especially useful in averaging time rates,
in finding the average price per unit when the data give the amount
of the commodity for a given price ("so much for a unit of money"),
and in the development of index numbers. For example, suppose
a man travels two miles, the first at the rate of 10 miles an hour and
the second at the rate of 20 miles an hour. What is the average speed?
The "obvious" answer of 15 miles an hour is not correct, for the
man traveled only two miles and he consumed only (1/10 + 1/20) of an
hour, or 3/20 of an hour. At 15 miles an hour he would have traveled
in 3/20 of an hour the distance (3/20)(15) = 2 1/4 miles, not 2 miles. If r
is his average rate in miles an hour, then:

$$\left(\frac{1}{10} + \frac{1}{20}\right) r = 2$$

from which we obtain:

$$r = 13\tfrac{1}{3} \text{ miles per hour}$$

and this is the harmonic mean of the rates.
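
The two-mile illustration can be checked numerically. The following is only a minimal sketch in Python; the helper name `harmonic_mean` is our own choice for illustration, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    values = [Fraction(v) for v in values]
    return len(values) / sum(1 / v for v in values)

speeds = [10, 20]      # miles per hour, one mile travelled at each speed
total_distance = 2     # miles
total_time = sum(Fraction(1, s) for s in speeds)   # hours: 1/10 + 1/20 = 3/20

average_speed = total_distance / total_time

print(average_speed)              # 40/3, that is, 13 1/3 miles per hour
print(harmonic_mean(speeds))      # 40/3, the harmonic mean of the two rates
print(harmonic_mean([2, 4, 6]))   # 36/11, the earlier numerical example
```

The average speed found from total distance and total time agrees with the harmonic mean of the rates because the distance covered at each rate is the same.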

As a second illustration suppose a man on a journey purchases 
gasoline as in the following table : 


Table 21. Purchase of Gasoline

Dealer    Number of Gallons for $1    Cost per Gallon
                     X                   (Dollars)
  1                  8                     1/8
  2                 12                     1/12
  3                 10                     1/10


We wish to find the average price per gallon and the average number 
of gallons for $1. 

In the preceding illustration we assumed that the average rate
times the total time gave the total distance. Similarly we assume
here that the average price times the number of units purchased
gives the total cost.

CASE A. Spending the same amount with each dealer. Suppose D
dollars are spent with each dealer. We then have (8D + 12D + 10D) = 30D
gallons bought at a total cost of 3D dollars. Hence the average price
per gallon is (3D) ÷ (30D) = 1/10 dollar, or 10 cents per gallon.

CASE B. Buying the same amount from each dealer. Suppose G
gallons are purchased from each dealer. We then have 3G gallons
purchased at a total cost of (G/8 + G/12 + G/10) dollars. Hence
the average price per gallon is (G/8 + G/12 + G/10) ÷ (3G) = 37/360
dollar, or 10 5/18 cents per gallon. The reciprocal of this quantity,
360/37 or about 9.73, gives the average number of gallons for $1 and
is the harmonic mean of the given X values.
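
Cases A and B can be reproduced in a few lines. This is a sketch only: the variable names are ours, and D and G are set to 1 because each cancels from its quotient.

```python
from fractions import Fraction

gallons_per_dollar = [8, 12, 10]    # the X values of Table 21
n = len(gallons_per_dollar)

# Case A: the same amount D is spent with each dealer (D cancels; take D = 1).
D = Fraction(1)
gallons_bought = sum(x * D for x in gallons_per_dollar)   # 30D gallons
price_case_a = (n * D) / gallons_bought                   # dollars per gallon

# Case B: the same quantity G is bought from each dealer (G cancels; take G = 1).
G = Fraction(1)
dollars_spent = sum(G / x for x in gallons_per_dollar)    # G/8 + G/12 + G/10
price_case_b = dollars_spent / (n * G)                    # dollars per gallon

print(price_case_a * 100)   # 10 cents per gallon
print(price_case_b * 100)   # 185/18, that is, 10 5/18 cents per gallon
print(1 / price_case_b)     # 360/37 gallons for $1, the harmonic mean of X
```

In Case A the average number of gallons for $1 is the arithmetic mean of the X values (30/3 = 10); in Case B it is their harmonic mean.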

Exercise. A manufacturer of rivets purchased copper as follows:
In 1918, 4 pounds for $1; in 1921, 8 pounds for $1; in 1925, 6 2/3 pounds
for $1; in 1932, 20 pounds for $1. Find the average price per pound
and the average number of pounds for $1 on two hypotheses.

The observant student will note that the price of an article may 
be expressed in two ways, 

(a) p units of money per unit of quantity, or 

(b) q units of quantity per unit of money. 

Thus, the price of sugar may be given as (a) 5 cents per pound or
as (b) 20 pounds for a dollar or 1/5 of a pound for a cent.

Similarly, the speed of a moving body may be expressed as 

(a) d units of distance per unit time, or 

(b) t units of time per unit distance. 

Thus the speed of a car may be given as (a) 30 miles per hour or as 
(b) 2 minutes per mile. 

Of course we are more familiar with prices and speeds expressed 
in forms (a) but we need to give attention to forms (b) since they 
do occur. Moreover, the correct average will depend upon how the 
data are stated, as the previous illustrations confirm. 

The following theorems will clarify some of the apparent confusion 
in which we find ourselves. Note that in the hypotheses and the 
conclusions of these theorems we assume that prices and speeds are 
expressed in the familiar forms (a). 



A. Average Speeds and Rates

If

    s = speed, number of units distance per unit time, and
    t = number of units time,

then we define:

$$\text{Average speed} = \frac{\text{total distance}}{\text{total time}} = \frac{\sum st}{\sum t} \tag{9}$$


Theorem I. If the time on each trip is constant, that is, if t = c,
then

$$\text{Average speed} = \frac{\sum st}{\sum t} = \frac{\sum sc}{\sum c} = \frac{c\sum s}{Nc} = \frac{\sum s}{N} \tag{10}$$

which is the arithmetic mean of the several speeds.

Example 1 . A man traveled by auto 3 days. He drove 10 hours each 
day. He drove 

the first day 10 hours at 45 mi. per hr., 
the second day 10 hours at 40 mi. per hr., and 
the third day 10 hours at 38 mi. per hr.

What was his average speed? 

Solution: Here we have the case in which the time t of each trip is
constant and equal to 10 hours. Hence, by (10)

$$\text{Average speed} = \frac{\sum s}{N} = \frac{45 + 40 + 38}{3} = 41 \text{ mi. per hr.}$$


The student may verify this by formula (9). 


Theorem II. If the total distance covered each trip is constant,
that is, if st = c, then

$$\text{Average speed} = \frac{\sum st}{\sum t} = \frac{\sum c}{\sum \frac{c}{s}} = \frac{Nc}{c\sum \frac{1}{s}} = \frac{N}{\sum \frac{1}{s}} \tag{11}$$

which is the harmonic mean of the speeds.


Example 2. A man traveled by auto 3 days. He covered 480 miles 
each day. He drove 

the first day 10 hours at 48 mi. per hr., 
the second day 12 hours at 40 mi. per hr., and 
the third day 15 hours at 32 mi. per hr. 


What was his average speed?

Solution: Here we note that the total distance covered each trip (day)
is constant and equal to 480 miles. Hence, by (11)

$$\text{Average speed} = \frac{N}{\sum \frac{1}{s}} = \frac{3}{\frac{1}{48} + \frac{1}{40} + \frac{1}{32}} = \frac{3}{\frac{37}{480}} = 38\tfrac{34}{37} \text{ mi. per hr.}$$

The student may verify this by formula (9).
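
The verification by formula (9) suggested for Examples 1 and 2 may be sketched as follows. The function name `average_speed` and the list layout are assumptions made for this illustration only.

```python
from fractions import Fraction

def average_speed(trips):
    """Formula (9): total distance / total time.

    `trips` is a list of (speed, hours) pairs.
    """
    total_distance = sum(Fraction(s) * Fraction(t) for s, t in trips)
    total_time = sum(Fraction(t) for _, t in trips)
    return total_distance / total_time

example_1 = [(45, 10), (40, 10), (38, 10)]   # equal times (Theorem I)
example_2 = [(48, 10), (40, 12), (32, 15)]   # equal distances, 480 miles (Theorem II)

speeds_1 = [s for s, _ in example_1]
speeds_2 = [s for s, _ in example_2]

arithmetic = Fraction(sum(speeds_1), len(speeds_1))                 # formula (10)
harmonic = len(speeds_2) / sum(Fraction(1, s) for s in speeds_2)    # formula (11)

print(average_speed(example_1), arithmetic)   # 41 and 41
print(average_speed(example_2), harmonic)     # 1440/37 and 1440/37, i.e. 38 34/37
```

In each example the general definition (9) gives the same value as the special formula, as the two theorems assert.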


B. Average Prices 


Let

    p = price per unit (number of units of money per unit of quantity)
    q = quantity (number of units purchased at price p)

then we define:

$$\text{Average price} = \frac{\text{total amount spent}}{\text{total quantity purchased}} = \frac{\sum pq}{\sum q} \tag{12}$$


Theorem III. If the total amount spent at each transaction is
constant, that is, if pq = c, then

$$\text{Average price} = \frac{\sum pq}{\sum q} = \frac{\sum c}{\sum \frac{c}{p}} = \frac{Nc}{c\sum \frac{1}{p}} = \frac{N}{\sum \frac{1}{p}} \tag{13}$$

which is the harmonic mean of the prices.


Example 3. Mr. Jones usually spends $120 a year for coal. He bought
during

the first year 15 tons at $8 per ton,
the second year 12 tons at $10 per ton, and
the third year 10 tons at $12 per ton.

What was the average price of the coal?

Solution: Here we note that the total amount spent each year is constant
and equal to $120. Hence, employing (13) we find the

$$\text{Average price} = \frac{N}{\sum \frac{1}{p}} = \frac{3}{\frac{1}{8} + \frac{1}{10} + \frac{1}{12}} = \frac{3}{\frac{37}{120}} = \frac{360}{37} = \$9.73 \text{ per ton}$$

The student may verify this by formula (12).


Theorem IV. If the same number of units is purchased at each
transaction, that is, if q = c, then

$$\text{Average price} = \frac{\sum pq}{\sum q} = \frac{\sum pc}{\sum c} = \frac{c\sum p}{Nc} = \frac{\sum p}{N} \tag{14}$$

which is the arithmetic mean of the prices.

Example 4. When Mr. Brown purchased gasoline, he regularly purchased
10 gallons. He purchased

at station A, 10 gallons at 14¢ per gal.,
at station B, 10 gallons at 18¢ per gal.,
at station C, 10 gallons at 15¢ per gal., and
at station D, 10 gallons at 13¢ per gal.

What was the average price per gallon?

Solution: In this case we note that the same number of units, 10 gallons,
was purchased at each station. Hence, by (14) we obtain

$$\text{Average price} = \frac{\sum p}{N} = \frac{14 + 18 + 15 + 13}{4} = \frac{60}{4} = 15 \text{ cents per gallon}$$

The student may verify this by formula (12).
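
The check by formula (12) for Examples 3 and 4 may be sketched in the same way. The name `average_price` and the (price, quantity) layout are assumptions of this illustration.

```python
from fractions import Fraction

def average_price(purchases):
    """Formula (12): total amount spent / total quantity purchased.

    `purchases` is a list of (price, quantity) pairs.
    """
    total_spent = sum(Fraction(p) * Fraction(q) for p, q in purchases)
    total_quantity = sum(Fraction(q) for _, q in purchases)
    return total_spent / total_quantity

coal = [(8, 15), (10, 12), (12, 10)]                  # dollars per ton, tons
gasoline = [(14, 10), (18, 10), (15, 10), (13, 10)]   # cents per gallon, gallons

coal_prices = [p for p, _ in coal]
gas_prices = [p for p, _ in gasoline]

harmonic = len(coal_prices) / sum(Fraction(1, p) for p in coal_prices)   # formula (13)
arithmetic = Fraction(sum(gas_prices), len(gas_prices))                  # formula (14)

print(average_price(coal), harmonic)              # 360/37 both ways
print(round(float(average_price(coal)), 2))       # 9.73 dollars per ton
print(average_price(gasoline), arithmetic)        # 15 cents per gallon both ways
```

Equal amounts spent lead to the harmonic mean of the prices; equal quantities bought lead to their arithmetic mean.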


EXERCISES 

1. A young man took a trip by bicycle. He rode 8 hours each day. 
He traveled 32 miles the first day, 28 miles the second day, 24 miles the 
third day, and 20 miles the fourth day. What was his average speed? 

2. A man bought four kinds of apples at the following prices:

5 bushels of the first kind at 40¢ per bu.,
5 bushels of the second kind at 50¢ per bu.,
5 bushels of the third kind at 75¢ per bu., and
5 bushels of the fourth kind at $1.00 per bu.

What was the average price per bushel?





3. William Smith purchased gasoline from three dealers. He purchased 

from A, 20 gallons at 17¢ per gallon,
from B, 10 gallons at 11¢ per gallon, and
from C, 15 gallons at 15¢ per gallon.

What was the average price per gallon? 

4. Three ships make the same round-trip in 20, 24, and 30 days re- 
spectively. What was the average number of days for the trip? 

5. In a certain factory a unit of work is completed by A in 4 minutes,
by B in 5 minutes, by C in 6 minutes, by D in 10 minutes, and by E in 
12 minutes. Find (a) the average number of units per hour, (b) the average 
number of minutes per unit, and (c) the total number of units they will 
complete in 8 hours. 

6. A man travels 20 miles at 40 miles per hour, 10 miles at 30 miles 
per hour, and 30 miles at 60 miles per hour. What was his average speed? 

7. Five boys were given a page of problems with the instruction to 
solve as many as they could in an hour. A solved 12, B solved 10, C 
solved 8, D solved 6, and E solved 4. What was the average number 
of problems per hour and the average number of minutes per problem? 

8. Given two unequal observations $X_1$ and $X_2$, prove

$$M_h < M_g < M$$

9. Given three unequal observations $X_1$, $X_2$, and $X_3$, prove

$$M_h < M_g < M$$

10. a. Given N unequal observations $X_1, X_2, X_3, \ldots, X_N$, prove

$$M_h < M_g < M$$

b. How do the three means compare if all the observations are equal
in value?

This is a fairly tough problem. For references, see Burgess, Mathematics
of Statistics, page 101; Chrystal, Algebra, Part II, page 46; Hall
and Knight, Higher Algebra, page 211.

11. Prove that the product of the first N integers is less than

$$\left(\frac{1 + N}{2}\right)^N$$

Hint. Use Exercise 10a above and Exercise 4a page 74.

12. Prove that the product of the first N odd integers is less than $N^N$.
Hint. Use Exercise 10a above and Exercise 4b page 74.


29. DISCUSSION AND CRITICISM OF THE MEASURES 
OF CENTRAL TENDENCY 

Owing to the fact that many distributions tend to “pile up” near
the center, we have chosen the term central tendency to describe this
behavior. The measures of central tendency are statistical constants
that give the striking feature of the central, the predominant, the
typical variates. The arithmetic mean, the median, and the mode
are the most widely used. The arithmetic mean is that measure the
algebraical sum of the deviations from which is equal to zero. The
median is that quantity such that half of the observed measures
exceed it in value and half are exceeded by it. We shall see later that
the median is the point from which the sum of the absolute values
of the deviations of all the measures from it is a minimum. The
mode is the value at which the ideal frequency curve fitted to the
given distribution has a maximum.

All the measures of central tendency are called averages. Since
the averages we have thus far considered are so different in their
meanings and since we shall meet other averages in succeeding
chapters, to the statistician the term average is quite indefinite. In
the consideration of the first exercise at the end of this chapter the
student will have opportunity to observe that such terms as “the
average student,” “the average-sized apple,” etcetera, do not connote
the same to all. We should therefore speak definitely as to which
average is meant when the term is used.

A. The Arithmetic Mean. The arithmetic mean is probably the 
best understood of all the averages. To many people it is the average. 
It is easy to compute, is rigidly defined, is based on all the measures, 
and is well designed for algebraical manipulation. Arithmetic means 
of different series can be readily combined to determine the arithmetic 
mean of the entire group. The arithmetic mean can be determined 
if the total and the number of the items are known, and is useful in 
case a large weight is desired for the extreme measures. The arith- 
metic mean is especially admired for its stability or its reliability. 
If many samples are drawn from some parent population, the arith- 
metic means of the given samples will usually show less fluctuation 
than the other averages. We describe this property by saying that 
the arithmetic mean is a very reliable or a very stable average. 
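
The remark that arithmetic means of different series can be readily combined may be illustrated by a short sketch. The two groups of scores below are hypothetical figures chosen only for this illustration, and the function name `combined_mean` is likewise our own; the point is that the mean of the whole follows from the separate means and the numbers of items alone.

```python
from fractions import Fraction

def combined_mean(groups):
    """Mean of the entire group from (mean, number of items) pairs."""
    total = sum(Fraction(m) * n for m, n in groups)   # each mean times its N gives that total
    count = sum(n for _, n in groups)
    return total / count

# Hypothetical data, not taken from the text.
series_a = [60, 70, 80]
series_b = [90, 100]

mean_a = Fraction(sum(series_a), len(series_a))   # 70
mean_b = Fraction(sum(series_b), len(series_b))   # 95

from_means = combined_mean([(mean_a, len(series_a)), (mean_b, len(series_b))])
direct = Fraction(sum(series_a + series_b), len(series_a + series_b))

print(from_means, direct)   # 80 and 80
```

The combined value is a weighted mean of the separate means, the weights being the numbers of items; a simple average of 70 and 95 would not give the correct result.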

Situations frequently arise where the emphasis upon the extreme 
measurements is undesirable. For example, the great wealth of one 
man in a community will unduly influence the arithmetic average 
of the wealth of the community, and thus the arithmetic mean will 
give a distorted picture of the average wealth of the community. 




A disproportionately large salary paid to one employee of a group 
may cause the average salary of the group based upon the arithmetic 
mean to give an unfair impression of the salaries of the group as a 
whole. A hundred-dollar bill in a collection plate may cause the 
arithmetic average of the donations to appear absurdly large. 

B. The Median. The median is rigidly defined and is easy to
determine. It is based upon all the measures, each of which has equal
influence, and it is not unduly influenced by the extreme measures.
It follows that the median is useful wherever extreme items are of
little importance. It is useful in characterizing groups of a
non-mathematical character which we cannot measure and yet can
arrange according to size.

The median is not so well understood as the arithmetic mean or 
the mode, and it is not designed for further mathematical treatment. 
It shows a greater fluctuation from sample to sample and hence is 
generally less reliable than the arithmetic mean. 

A further objection to the median is its insensitivity. Thus we 
can replace certain measurements of a given group by other measure- 
ments without having any effect upon the median. Let us consider 
the series 1, 3, 5, 7, 9, 11, 13. For this series we have: 

$M_d = 7$ and $M = 7$

I may replace, for example, the three numbers which are larger than 7
by three other numbers which are likewise larger than 7 and this
replacement will have no effect upon the median. Thus the series 1,
3, 5, 7, 16, 20, 32 has:

$M_d = 7$ and $M = 12$

The student will discover other replacements that will in no way 
affect the median but may have tremendous effect upon the arithmetic 
mean. Exercise 19 at the end of this chapter is an illustration of the 
fact that shifting the positions of certain measurements of a group 
may have no effect whatever upon the median provided the median 
point is not crossed. 
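
The insensitivity of the median in the two series above can be confirmed with Python's standard `statistics` module. This is a minimal sketch; the variable names are ours.

```python
from statistics import mean, median

original = [1, 3, 5, 7, 9, 11, 13]
altered = [1, 3, 5, 7, 16, 20, 32]   # the three values above 7 replaced

print(median(original), mean(original))   # median 7, mean 7
print(median(altered), mean(altered))     # median 7, mean 12
```

The median is unchanged because the same number of measures still lies on each side of 7, whereas the arithmetic mean responds to every change in the total.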

C. The Mode. Though the technical term may not be well 
known, the concept of the mode is well understood and easily com- 
prehended. It is probably what the editors of our newspapers have 
in mind when they speak of “the average citizen.” Like the median, 
it is not greatly influenced by the extreme variates. Though the
true mode is difficult to determine, yet the term mode is so important
that even an approximate mode is often satisfactory. An approxi- 
mate mode is not difficult to determine, and it owes its importance 
to the fact that it is located in the region where the frequency is most 
dense. It shows the most frequent measure. For a clothing mer- 
chant, the mode of a distribution of chest measurements is the impor- 
tant average. 

It frequently happens that a distribution has no well-defined mode, 
or there may be several apparent modes. The mode therefore has no 
meaning unless there is a decided central tendency. The mode is also 
insensitive. 

D. The Geometric Mean. The geometric mean is based on all 
the measures, is rigidly defined, is suited to algebraic manipulation, 
is not unduly influenced by extreme measures, and gives equal weight 
to equal rates of change. It may be appropriately used when emphasis 
is on the ratio between two quantities rather than on their absolute 
difference. 

The objections to the geometric mean are that it is not well under- 
stood, is difficult to compute, and is difficult for the non-mathema- 
tical student to comprehend. 

It is evident that no one measure of central tendency can be con- 
sidered as the best. Each measure is useful in shedding some light 
upon a given problem, and the best selection can be made only by 
the experienced statistician for the particular purpose he has in 
mind. The values of the averages considered depend entirely upon 
the discrimination with which they are used and interpreted. The 
arithmetic mean is perhaps the most useful. The ease of its computa- 
tion, its wide uses in later applications, and its familiarity to the 
general reader make it highly serviceable in statistical work. 

EXERCISES 

1. What average is meant in each of the following: the average student? 
the average citizen? the average amount of material in a dress pattern? the 
average-sized apple? the average annual rainfall? the average price of wheat? 
the average ability in arithmetic? the average height? the average length of life? 
the average speed of a train between two stops? the average salary of teachers 
in a state? the average number of bushels of corn per acre in a nation? 

2. A college student carries 15 hours of class work per week and makes
the grades listed in the following table. What is his average grade?

Grades of Student Carrying 15 Hours of Class Work

Course          Semester Hours    Grades in
                Credit            Percentages
English               2               88
Mathematics           5               96
Language              5               80
Science               3               78
Total                15

3. A man has $10,000 invested at 5 per cent, $5,000 at 6 per cent, and 
$3,000 at 8 per cent. What is his average rate of interest? 

4. The following are the distributions of the scores of 334 Freshmen on
an achievement test in English given at Bucknell University in September, 
1929. In (a) the class width is 15, whereas in (b) the class width is 10. 
What are the class boundaries? Compute M for (a) and (b). 


Scores of 334 Students in English

        (a)                      (b)
    X       f(X)             X       f(X)
    47        3             45.5       1
    62        7             55.5       3
    77       15             65.5       6
    92       20             75.5      12
   107       22             85.5       9
   122       37             95.5      14
   137       45            105.5      13
   152       41            115.5      22
   167       55            125.5      24
   182       27            135.5      28
   197       30            145.5      24
   212       11            155.5      34
   227       15            165.5      42
   242        6            175.5      25
                           185.5      15
  Total     334            195.5      20
                           205.5      18
                           215.5       3
                           225.5      12
                           235.5       5
                           245.5       4
                           Total     334



5. Compute the medians for the distributions of Exercise 4.

6. Compute the approximate modes by formula (5) for the distributions
(a) and (b) of Exercise 4.

7. In the following table deaths from collisions of automobiles with
railroad trains and street cars are not included.


Automobile Fatalities in the Entire Registration
Area in Continental United States, 1911-1930 ¹

Year    Number of Deaths    Year    Number of Deaths
1911          1,291         1921         10,168
1912          1,758         1922         11,666
1913          2,488         1923         14,411
1914          2,826         1924         15,528
1915          3,978         1925         17,571
1916          5,193         1926         18,871
1917          6,724         1927         21,160
1918          7,525         1928         23,765
1919          7,968         1929         27,066
1920          9,103         1930         29,080


Make a chart representing these data. Find the geometric mean of the 
annual rates of increase. Also find the geometric mean of the number of 
deaths. 

8. The following table gives the deaths from tuberculosis by ages. Note 
that the class intervals are not all equal. Using the result of the theorem 
in Exercise 7 on page 74, find the mean age of death from this cause. 


Deaths from Tuberculosis by Ages ²

Age of Death    Number Dying    Age of Death    Number Dying
                    f(X)                            f(X)
   0- 4            1,356           30-34            8,776
   5- 9              537           35-44           15,456
  10-14            1,278           45-54           11,060
  15-19            6,300           55-64            7,455
  20-24           10,911           65-74            4,788
  25-29           10,349           75-84            1,866


1 The data are taken from Statistical Abstract of the United States , 1936, p. 367. 

2 The data are taken from Mortality Statistics , 1928, p. 160. The original 
data have been altered somewhat, e.g., the Bureau's final classification was “75 
and over." 





9. In Exercise 8 we combined the number of deaths from 0 to 4 in- 
clusive into a single group. We felt justified in doing this because in so 
doing we did not conceal any outstanding facts. However, in the accom- 
panying table such a procedure would do violence to some outstanding 
facts. 


Deaths from Diphtheria by Ages ¹

Age of Death    Number Dying    Age of Death    Number Dying
                    f(X)                            f(X)
  Under 1            602           20-24               89
   1- 2            1,133           25-29               56
   2- 3            1,183           30-34               67
   3- 4            1,112           35-44              110
   4- 5              913           45-54               60
   5- 9            2,290           55-64               52
  10-14              435           65-74               22
  15-19              118           75-84               12


Using the result of the theorem in Exercise 8 on page 74, find the mean 
age of death from this cause. Also find the median age of death. 

10. The total number of divorces granted in the continental United
States increased from 33,461 in 1890 to 195,939 in 1928. Assuming the 
annual rate of increase was constant, find its value. From this result, 
estimate the number of divorces granted in the years 1895, 1900, 1905, and 
1925. (See Statistical Abstract of the United States , 1930, p. 91.) The actual 
numbers given are: 1895, 40,387; 1900, 55,751; 1905, 67,976; 1925, 
170,952. 

11. For $1 a person purchased each of the following amounts of the
given articles: 

butter, 3 pounds potatoes, 40 pounds 

sugar, 20 pounds coffee, 4 pounds 

Find the average number of pounds for $1 and the average price per pound. 

12. Prove that the product of the ratios of each of N measures to their 
geometric mean is equal to unity. 

13. Prove that the geometric mean of the ratios of corresponding
measures in two series of N measures each is equal to the ratio of their 
geometric means. 

¹ The data are taken from Mortality Statistics, 1928, p. 150.

14. The following table gives the number of pounds of sugar that could
be bought for $1 in the given years:

Pounds of Sugar for $1, 1918-1922

Year    Pounds of Sugar
        Bought for $1
1918         10.3
1919          8.8
1920          5.2
1921         12.5
1922         13.7


What is the average price per pound during this period? Get two answers. 

15. The following distributions are of eggs from Barred Plymouth Rock
pullets. The measurements of length were recorded to the nearest milli- 
meter and those of breadth to the nearest half a millimeter. Are these 
tables consistent with our theory? What are the class boundaries? 


Lengths and Breadths of Random Sample of 450
Eggs from 450 Pullets ¹

          (a)                            (b)
Length in      Frequency      Breadth in     Frequency
Millimeters                   Millimeters
    X            f(X)             X            f(X)
   49.5            1            38.25            2
   50.5            1            38.75            4
   51.5            7            39.25            9
   52.5           22            39.75           18
   53.5           36            40.25           41
   54.5           71            40.75           52
   55.5           68            41.25           41
   56.5           77            41.75           65
   57.5           78            42.25           73
   58.5           35            42.75           48
   59.5           29            43.25           41
   60.5           10            43.75           26
   61.5            4            44.25           15
   62.5            3            44.75            7
   63.5            6            45.25            5
   64.5            1            45.75            2
   65.5            0            46.25            1
   66.5            0
   67.5            1
  Total          450            Total          450


Compute M for the distributions (a) and (b). 

1 The data are taken from Raymond Pearl and F. M. Surface, A Biometrical 
Study of Egg Production in the Domestic Fowl , Part III, p. 183. 





16. Find the medians for (a) and (b) in Exercise 15.

17. The number of births during a year is 1/48 of the population at the
beginning of the year and the number of deaths during a year is 1/60 of 
the population at the beginning of the year. Find the number of years 
for the given population to be doubled. 

18. Compute column 3 for these data, and note that the ratio is
approximately constant. Find the geometric mean of the expenditures.

Expenditure for Public Schools in the United States, 1909-1919

Year         Expenditure X    Ratio of Each         Log X
             (millions)       Item to One above
1909-1910       401.4
1910-1911       426.3
1911-1912       446.7
1912-1913       482.9
1913-1914       521.5
1914-1915       555.1
1915-1916       605.5
1916-1917       640.7
1917-1918       702.2
1918-1919       763.7


19. Here are data for two groups of laborers. Find the median wage for
each group. Find the arithmetic mean wage of each group. Which is
the “better-paid” group?

Wages of Two Groups of Laborers

Wages per Week          Frequencies
  (dollars)         Group A    Group B
  9.00-9.49             2          2
  8.50-8.99             2          2
  8.00-8.49            10         10
  7.50-7.99            39         39
  7.00-7.49            20          1
  6.50-6.99            16         16
  6.00-6.49             6          6
  5.50-5.99             4          4
  5.00-5.49             1         20
  Total               100        100





20. The population of New York State increased at a constant annual 
rate from 9,114,000 in 1910 to 10,385,000 in 1920. What was the annual 
rate of increase? Assuming the same annual rate to continue during the 
period 1920 to 1925, estimate the population of New York State in 1925. 
Compare your estimate with the count of the State Census which was 
11,161,000. 

21. The number of bacteria in a certain culture was found to be 4(10^6)
at noon of one day. At noon the next day, the number was found to be
9(10^6). If the number increased at a constant hourly rate, how many
bacteria were there at midnight?

22. The price of an automobile decreased in value at a constant annual 
rate from $1,000 to $300 in five years. What was the annual rate of
decrease? What was the value of the car at the end of three years?


23. The accompanying table gives the production of motor vehicles in the
United States for the years 1908 to 1917 inclusive. Find M and $M_o$ of
the production.

Year    Production x
        (Thousands)
1908          65
1909         131
1910         187
1911         210
1912         378
1913         485
1914         569
1915         970
1916        1618
1917        1874


24. The production of Portland Cement in the United States increased
from 99 million barrels in 1921 to 176 million barrels in 1928. Assuming 
that the production increased at a constant annual rate, find the average 
annual rate of increase. 

25. The population of Detroit increased at a constant annual rate from
465,700 in 1910 to 993,700 in 1920. What was the average annual rate 
of increase? Assuming the same annual rate to continue during the period 
1920 to 1930, estimate the population of Detroit in 1930. Compare your 
estimate with the actual census report which gave 1,568,700. 

26. If in 1930 the city of Detroit built a water system sufficient to supply
a population of 2,500,000, how many years may elapse before the city 
finds it necessary to enlarge its water system? Base your estimate upon 
the three census reports. [See Exercise 25.] 

Hint: If a is the annual rate of increase the first decade, and b is the
annual rate of increase the second decade, the average annual rate is

$$x = \sqrt{(1 + a)(1 + b)} - 1.$$
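
The formula of the hint may be illustrated by a short sketch. The rates a = 5 per cent and b = 3 per cent are hypothetical values chosen for the illustration and are not the answer to the exercise; the function name is likewise ours. The check rests on the two periods being of equal length (ten years each).

```python
import math

def average_annual_rate(a, b):
    """Geometric mean of the two growth factors, less one."""
    return math.sqrt((1 + a) * (1 + b)) - 1

# Hypothetical annual rates for the first and second decades.
a, b = 0.05, 0.03
x = average_annual_rate(a, b)

growth_two_decades = (1 + a) ** 10 * (1 + b) ** 10   # ten years at a, then ten at b
growth_at_average = (1 + x) ** 20                    # twenty years at the average rate

print(x)
print(growth_two_decades, growth_at_average)         # equal up to rounding error
```

The average rate x lies slightly below the arithmetic mean of a and b, as is to be expected of a geometric mean of unequal factors.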





27. During five successive years a certain investment earned 5 per cent, 
6 per cent, 6.5 per cent, 4 per cent, and 3.5 per cent. What was the 
average annual rate of increase? 

28. A does a unit of work in 20 minutes, and B does a unit of work in 
24 minutes. What is their average rate of working? 

29. The sales record of a certain firm showed the following items: 
800 articles at 10 cents; 400 articles at 25 cents; 300 articles at 50 cents. 
What was the average price per article? 

30. A man travels two miles, the first at a miles an hour and the second
at b miles an hour. Show that his average rate is

$$\frac{2ab}{a + b}$$

miles an hour. What type of average is this?

31. The annual wages earned by a group of 423 chief wage earners in
families are given in the following table. (Houghteling, Leila: The Income
and Standard of Living of Unskilled Laborers in Chicago, p. 27.) Compute
M, $M_d$, and $M_o$ (by fitting a parabola) for this distribution.

Class           X      f(X)
$800- 899                6
 900- 999               11
1000-1099               40
1100-1199               50
1200-1299               63
1300-1399               63
1400-1499               81
1500-1599               45
1600-1699               24
1700-1799               20
1800-1899                6
1900-1999                7
2000-2099                2
2100-2199                4
2200-2299                0
2300-2399                1
Total                  423


32. Milk is standardized according to its butterfat content. For ex- 
ample, ordinary legal milk is a 3 per cent milk, that is, 3 per cent of its 
weight is butterfat. If a farmer mixes 8 gallons of 3 per cent milk, 10 gal- 
lons of 2.9 per cent milk, 5 gallons of 3.5 per cent milk, 4 gallons of 5.3 per 
cent milk, what per cent butterfat is the mixture? What type of average 
is this? 





33. Compute $M_g$ for the data of Exercise 12, page 57.

34. Which measure of central tendency would you use to summarize
the frequency distribution of the following cases, and why?

(1) Income of parents of Bucknell students.

(2) Amount spent for food by Bucknell students.

(3) Number of hours per week spent in outside preparation by Bucknell
students.

(4) Height of Bucknell men.

(5) Weight of Bucknell women.


35. Exercise 12, page 74, gives three distributions of weekly wages in
the clothing industry. Find $M_o$ for each distribution.

36. The annual salaries received by a group of Senior federal employees
are given in the following table. (White: Public Administration, page 290.)
Note that the class intervals are not all equal.


Class                         X      f(X)
  mo  and under  $840                   2
 840   "    "     900                   5
 900   "    "    1000                  18
1000   "    "    1100                 123
1100   "    "    1200                 369
1200   "    "    1320                1208
1320   "    "    1440                 437
1440   "    "    1560                  63
1560   "    "    1800                  74
1800   "    "    2000                  30
2000   "    "    2500                   5
Total                                2334

Compute M, $M_d$, and $M_o$ (by fitting a parabola) for this distribution.
Which average is the most appropriate?


37. A hardware company makes 7% on the invested capital the first
year. The profit is added to the original capital, and 9% is made on the 
total investment the second year. Proceeding in this way, the profits 
are 10% the third year, 12% the fourth year, and 15% the fifth year. 
What is the average rate during the five-year period? 

38. The value in millions of dollars of exports from the U.S. in the given 
years are shown in the following table. Compute the geometric mean of 
the values of the exports. 


Year    Value of Exports    Year    Value of Exports
1885         742.2          1905        1518.6
1890         857.8          1910        1745.0
1895         808.5          1915        2768.6
1900        1394.5





39. A man wishes to travel two miles, the first at 30 miles an hour and 
the second at such a speed that his average speed over the two mile course 
will be 60 miles an hour. At what speed must he travel the second mile? 

40. (For students who are familiar with determinants.) ¹ Let $(X_1, Y_1)$,
$(X_2, Y_2)$ and $(X_3, Y_3)$ be the coordinates of three points on the parabola
$Y = AX^2 + BX + C$. Show that the quotient, $-\frac{B}{2A}$, is given by

$$\begin{vmatrix} X_1^2 & Y_1 & 1 \\ X_2^2 & Y_2 & 1 \\ X_3^2 & Y_3 & 1 \end{vmatrix} \div\; 2 \begin{vmatrix} X_1 & Y_1 & 1 \\ X_2 & Y_2 & 1 \\ X_3 & Y_3 & 1 \end{vmatrix}$$

Thus, if $X_2$ is the crude mode and $X_1$ and $X_3$ the class marks of the
adjacent classes with $X_1 < X_2 < X_3$, the above quotient gives the value
of X of the approximate mode.

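41. If the class interval is taken as a unit and
$x_i' = \dfrac{X_i - X_2}{w}$, $i = 1, 2, 3$, show that we may obtain from
Exercise 40 above the value of $x'$ of the approximate mode to be

$$x' =
\begin{vmatrix} 1 & F_1 & 1 \\ 0 & F_2 & 1 \\ 1 & F_3 & 1 \end{vmatrix}
\div\; 2
\begin{vmatrix} -1 & F_1 & 1 \\ 0 & F_2 & 1 \\ 1 & F_3 & 1 \end{vmatrix}$$

when $x_2' = 0$ is taken at the crude mode.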

42. Supposing the frequencies of the X values are the terms of the
expansion $(q + p)^n$ as indicated in the table, find $M$ if $(q + p) = 1$.

X        f(x)

0        $q^n$
1        $n q^{n-1} p$
2        $\dfrac{n(n-1)}{2} q^{n-2} p^2$
...      ...
n        $p^n$


43. Find the arithmetic, the geometric, and the harmonic means of
the numbers: $1, 2, 4, 8, \ldots, 2^n$.

44. Show that the median of the numbers $1, 2, 3, \ldots, n$ is
$\dfrac{n + 1}{2}$.


1 I am indebted to Dr. C. W. Bruce for suggesting this exercise. 



Chapter 4

MEASUREMENT OF DISPERSION 


30. THE INADEQUACY OF MEASURES 
OF CENTRAL TENDENCY 

The preceding chapters have called attention to the necessity of 
inventing summary numbers to characterize masses of numerical 
data. Chapter 3 has dealt with certain terse expressions, single 
magnitudes, by means of which we may obtain an understanding of 
the typical characteristics of the group as a whole. They represent 
the acme of condensation. The arithmetic mean, for example, repre- 
sents an average size of the measures, and is the value such that the 
algebraic sum of the deviations of the measures from it is zero. All 
must admit the value of the measures of central tendency, but we 
must come to realize their insufficiency. Two groups of measures 
may have the same mean 1 and yet differ widely. Consider the two 
groups below : 

Group I             Group II

42                  10
45                  22
50 (the mean)       50 (the mean)
55                  78
58                  90

The numbers in Group I are concentrated about their mean, 
whereas those of Group II are widely scattered. Similarly, we may 
have two groups of laborers with the same mean salary and yet their 
distributions may differ widely. The mean salary may not be so 
important a characteristic as the variation of the items from the 
mean. To the student of social affairs, the mean income is not so 
vitally important as to know how this income is distributed. Are a 
large number receiving the mean income or are there a few with 
enormous incomes and millions with incomes far below the mean? 
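
The contrast between the two groups can be checked numerically. The
following is a minimal sketch in Python (the variable names are
illustrative and do not come from the text); it computes the mean of
each group and the distance between its extreme measures.

```python
from statistics import mean

# The two groups of measures listed in the text.
group_1 = [42, 45, 50, 55, 58]
group_2 = [10, 22, 50, 78, 90]

for name, group in (("Group I", group_1), ("Group II", group_2)):
    spread = max(group) - min(group)  # distance between the extremes
    print(name, "mean =", mean(group), "spread =", spread)
```

Both groups report a mean of 50, while the spread of Group II is five
times that of Group I.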

1 In what follows, when the term mean is used without a qualifying adjective, 
the arithmetic mean is meant. 




Figures 9, 10, and 11 represent frequency distributions with some
of the characteristics we wish to emphasize here. The two curves in
(a) represent two distributions with the same mean, $M$, but with
different dispersions. The two curves in (b) represent two distribu-
tions with the same dispersion but with unequal means, $M_1$ and $M_2$.
Finally, (c) represents two distributions with unequal means and
unequal dispersions.

[Figure 9: frequency curves; horizontal axis X]

The measures of central tendency are therefore insufficient. They 
must be supported by and supplemented with other measures. In 
this chapter, we shall be especially concerned with the measures of 
variability, or spread, or dispersion. A measure of dispersion is 
designed to state the extent to which the individual measures differ 
on the average from the mean. 1 In measuring dispersion we shall be
interested in the amount of the variation or its degree but not in the
direction. 2 For example, a measure of 4 inches below the mean has
just as much dispersion as a measure of 4 inches above the mean.
The amount of variability, or absolute variability, will be expressed
in concrete units, the same units that are used for the original variates,
while the degree, or relative variability, will be expressed in abstract
numbers or ratios. A measure of absolute variation is useful in 
describing a single frequency distribution, but if two different dis- 
tributions are to be compared, difficulties are encountered. 

The real significance of the statements of the paragraph above will 
be comprehended as we proceed further into the chapter. The 
computation of the measures of variation for several distributions 
will convince us that a measure of absolute variation is significant 
only in proportion to the size of the thing varying. Therefore, for 
the comparison of the variation in two distributions, we shall find 
it necessary to define certain measures of relative variability. 

There are several measures of absolute variability to which we shall 
give attention. They are (1) the range, (2) the semi-interquartile 
range, (3) the mean deviation, (4) the standard deviation, and 
(5) the probable error. 3 As to measures of relative variability, we
shall call attention to several, but we shall express our preference 
for the coefficient of variation , an invention of Professor Karl Pearson. 

1 Generally from the mean; infrequently from other measures of central 
tendency. 

2 The question of the direction of the variation will be answered in
Chapter 5 in connection with the skewness.

3 A more extensive treatment of probable error will be found in
Chapter 12. Also, see Section 35.





EXERCISES 


1. The heights of 11 men were 61, 64, 68, 69, 67, 68, 66, 70, 65, 67, and
72 inches. If the shortest man is omitted, what is the percentage change 
in the range ? 

2. The weights of 11 forty-year-old men were 148, 154, 158, 160, 161, 
162, 166, 170, 182, 195, and 236 pounds. If the heaviest man is omitted, 
what is the percentage change in the range? 

3. The range of the heights of the 11 men considered in Exercise 1 is 
72 — 61 = 11 inches and the range of the weights of the men considered 
in Exercise 2 is 236 — 148 = 88 pounds. Can you determine from this 
information which shows the greater variation, the 11 measurements of 
height or the 11 measurements of weight? 

4. Find the ratio of the range to the mean in Exercise 1 and Exercise 2. 
If these ratios are used to measure the relative variations, can you answer 
the question proposed in Exercise 3? 

5. A sample of 1515 college men was measured as to height. Their
mean height was found to be 67.9 inches. What would you consider a
reasonable variation on either side of the mean for such a set of data?

6. A sample of 1515 college men was measured as to weight. Their
mean weight was found to be 138.9 pounds. What would you consider a
reasonable variation on either side of the mean for such a set of data?


             A       B
  X        f(x)    f(x)

  2.5        1       0
  7.5        2       0
 12.5        3       0
 17.5        5       1
 22.5        7       3
 27.5        8      14
 32.5        9      17
 37.5        9      17
 42.5        8      14
 47.5        7       3
 52.5        5       1
 57.5        3       0
 62.5        2       0
 67.5        1       0

 Total      70      70

Construct frequency polygons on the same sheet for distributions A
and B. Compare their arithmetic means, their medians, and their
modes. Do the measures of central tendency constitute a sufficient
description of these groups?


31. THE RANGE 

The simplest possible measure of the variation of a group of measures
is the range, that is, the difference between the highest recorded
score and the lowest recorded score. Since the range is determined
by only the two extreme measures, it tells us nothing of the distribu- 
tion between these extremes; it tells us nothing about the concentra- 
tion of the measures about the center. 

Consider the distribution of heights in Exercise 1 (p. 54) and note 
that the one man in the tallest class increases the range about 10 per 
cent. Such an erratic measure is of little use for purposes of com- 
parison. We need a more stable measure. 
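
The instability of the range is easy to see in a small computation. The
sketch below is only an illustration: the heights are hypothetical values
chosen for the example (they are not the data of Exercise 1, p. 54), and
`value_range` is a helper name introduced here.

```python
def value_range(measures):
    """Return the range: highest measure minus lowest measure."""
    return max(measures) - min(measures)

# Hypothetical heights in inches; the last value is a single extreme case.
heights = [64, 65, 66, 66, 67, 67, 68, 68, 69, 70, 75]

with_extreme = value_range(heights)          # 75 - 64
without_extreme = value_range(heights[:-1])  # 70 - 64
print(with_extreme, without_extreme)
```

A single tall individual nearly doubles the range here, although the
other ten measures are unchanged.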


32. THE QUARTILE DEVIATION 

A measure of variation superior to the range is the quartile range
or half of it, the semi-interquartile range, sometimes called the quartile
deviation. The quartiles are the points on the X-scale that divide the
distribution into four equal parts. Obviously, there are three quar-
tiles, the second coinciding with the median. More precisely stated,
the lower quartile, $Q_1$, is that point on the X-scale such that one-
fourth of the total frequency is less than $Q_1$ and three-fourths are
greater than $Q_1$. The upper quartile, $Q_3$, is that point on the X-scale
such that three-fourths of the total frequency are below $Q_3$ and one-
fourth is above it. Between $Q_1$ and $Q_3$, then, are included one-half
the total frequency. Since, under most circumstances, the central
half of a distribution tends to be fairly typical, the quartile range
$Q_3 - Q_1$ affords a convenient measure of absolute variation. The
greater the quartile range, the greater the dispersion.

It is customary to use one-half the quartile range as a measure of
dispersion, and to it is given the name of semi-interquartile range.
We denote it by $Q$, and hence:

$$Q = \frac{Q_3 - Q_1}{2} \qquad (1)$$

We can determine the quartiles in a manner similar to that used 
in the determination of the median (see Section 25, p. 76). The class 
intervals in which the quartiles lie are called the quartile classes. 

In the histogram, an area $wN$ represents $N$ measures, and an area
$\frac{wN}{4}$ represents $\frac{N}{4}$ measures.





Let $f_1$ and $f_3$ be the frequencies of the lower and upper quartile
classes.
Let $b_1$ and $b_3$ be the lower boundaries of these classes.
Let $n_1$ and $n_3$ be the accumulated frequencies of all classes below
the lower and upper quartile classes respectively.

$w$ = the class width
$N$ = the total frequency
$z_1$ = the distance $b_1Q_1$
$Q_1$ = the lower quartile

Then in Figure 12 we have:

$$\text{area } ABCDQ_1 = \frac{wN}{4}$$

$$ABb_1 + b_1CDQ_1 = \frac{wN}{4}$$

From which:

$$n_1 w + f_1 \cdot z_1 = \frac{wN}{4}$$

and

$$Q_1 = b_1 + z_1 = b_1 + \left(\frac{\frac{N}{4} - n_1}{f_1}\right) w \qquad (2)$$


[Figure 12]


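In a similar manner, by equating area $ACMRSQ_3$ to $\frac{3wN}{4}$,
we obtain:

$$Q_3 = b_3 + \left(\frac{\frac{3N}{4} - n_3}{f_3}\right) w \qquad (3)$$

If the median be designated by $Q_2$, formula (4) of Section 25 (p. 77)
may be written:

$$M_d = Q_2 = b_2 + \left(\frac{\frac{2N}{4} - n_2}{f_2}\right) w$$

and the three formulas may be written in the form:

$$Q_i = b_i + \left(\frac{\frac{iN}{4} - n_i}{f_i}\right) w, \qquad i = 1, 2, 3$$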


It should be noted that the determination of $Q_1$ and $Q_3$ requires
that we know the class boundaries of the classes that contain $Q_1$
and $Q_3$.

Therefore, to determine $Q_1$ we must first locate the class that
contains $Q_1$, the $Q_1$ class. This done, we will then know $N/4$, $n_1$,
$b_1$, and $f_1$. To locate the $Q_1$ class we find $N/4$, then begin at the
lower end of the scale and add the frequencies of the successive classes
until the lower boundary of the class containing $Q_1$ is reached. We then
know $n_1$, $b_1$, and $f_1$, and thus can immediately find $Q_1$. A similar
statement may be made with regard to $Q_3$.

The quartile points may also be found by simple analysis without
using formulas just as we found the median, $Q_2$. The method is
explained in Exercise 8 of the next appearing list of exercises.

Returning to the data of Table 8 (p. 26), for an illustrative example,
we have:

$$\frac{N}{4} = \frac{125}{4} = 31.25, \qquad \frac{3N}{4} = 93.75$$

The quartile class of $Q_1$ is the class 67.5-72.5, and the quartile
class for $Q_3$ is the class 77.5-82.5. Hence:



$b_1 = 67.5$ and $b_3 = 77.5$

$n_1 = 23$ and $n_3 = 84$

$f_1 = 24$ and $f_3 = 19$

Then:

$$Q_1 = 67.5 + \left(\frac{31.25 - 23}{24}\right) 5 = 69.22 \text{ c.u.}$$

and

$$Q_3 = 77.5 + \left(\frac{93.75 - 84}{19}\right) 5 = 80.06 \text{ c.u.}$$

This gives the quartile range to be:

$$Q_3 - Q_1 = 10.84 \quad\text{and}\quad Q = 5.42 \text{ c.u.}$$

In other words, half of the scores occupy a range of 10.84 on the
centigrade scale, almost equally distributed on either side of the
median. For, by Section 25 (p. 78), $M_d = Q_2 = 74.60$, and therefore:

$$M_d - Q_1 = 74.60 - 69.22 = 5.38 \text{ c.u.}$$

$$Q_3 - M_d = 80.06 - 74.60 = 5.46 \text{ c.u.}$$
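
The procedure for locating a quartile class and interpolating within it
can be sketched in Python as follows. The function name
`grouped_quartile` and its argument layout are assumptions made for this
illustration; the frequencies are those of the college algebra grades,
with class boundaries taken 2.5 on either side of each class mark, as the
quartile classes quoted above imply.

```python
def grouped_quartile(i, lower_bounds, freqs, width):
    """Q_i = b_i + ((i*N/4 - n_i) / f_i) * w for i = 1, 2, 3.

    lower_bounds and freqs list the classes from lowest to highest.
    """
    total = sum(freqs)
    target = i * total / 4
    below = 0  # n_i: accumulated frequency below the current class
    for b, f in zip(lower_bounds, freqs):
        if f > 0 and below + f >= target:  # this is the Q_i class
            return b + (target - below) / f * width
        below += f
    raise ValueError("quartile class not found")

# Classes 47.5-52.5, 52.5-57.5, ..., 92.5-97.5 with their frequencies.
lower_bounds = [47.5 + 5 * k for k in range(10)]
freqs = [2, 4, 6, 11, 24, 37, 19, 12, 6, 4]

q1 = grouped_quartile(1, lower_bounds, freqs, 5)
q3 = grouped_quartile(3, lower_bounds, freqs, 5)
q = (q3 - q1) / 2  # semi-interquartile range
print(round(q1, 2), round(q3, 2), round(q, 2))
```

Direct evaluation agrees with the values in the text to within a unit in
the second decimal place; any small difference comes from rounding.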


As previously stated, the quartiles, and hence $Q$, are expressed in
terms of the original units, but if we divide $Q$ by $\frac{Q_3 + Q_1}{2}$
we have a quartile coefficient of dispersion which may be used to
measure relative variation. This coefficient is a ratio, a pure number
less than unity in value, and hence in using it we may compare the
variations in distributions of unlike units, as distributions of heights
in inches with distributions of weights in kilograms. Designating the
quartile coefficient of dispersion by $V_q$, we have:

$$V_q = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

In case the distribution is symmetrical:

$$Q_2 - Q_1 = Q_3 - Q_2$$

from which

$$Q_2 = M_d = \frac{Q_3 + Q_1}{2}$$

and

$$V_q = \frac{Q_3 - Q_1}{2Q_2} \qquad (4)$$

In this case, the distance from the median to either $Q_3$ or $Q_1$ is
called the probable deviation, sometimes loosely called the probable
error. In other words, the probable deviation is that distance which if
laid off on either side of the median of a symmetrical distribution will
include 50 per cent of the measures. If the distribution is not only
symmetrical but normal 1 (see Section 35, p. 134), this distance is
properly called the probable error.

EXERCISES 

1 . Find V q for the distribution of college algebra grades as described 
in Table 8 (p. 26). State a use of this result. 

2 . Find Q and V q for the distributions of heights and weights as de- 
scribed in Exercise 1 (p. 54). Give meaning to your results. 

3. The deciles are the points on the X-scale which divide the distribution
into ten equal parts. If $b_1, b_2, \ldots, b_9$ be the lower boundaries of
the decile classes; $f_1, f_2, \ldots, f_9$ be the frequencies of the decile
classes, and $n_1, n_2, \ldots, n_9$ be the accumulated frequencies in all
classes below the respective decile classes, and if $D_i$ be the $i$th
decile, show that:

$$D_i = b_i + \left(\frac{\frac{iN}{10} - n_i}{f_i}\right) w, \qquad i = 1, 2, \ldots, 9$$

4 . Find the deciles for the distributions of English scores as described in 
Exercise 4, page 102. 

5 . Suggest some measures of absolute and relative variability based 
upon the deciles. 

6. The percentiles are the points on the X-scale which divide the dis-
tribution into one hundred equal parts. If $b_1, b_2, \ldots, b_{99}$ be the
lower boundaries of the percentile classes; $f_1, f_2, \ldots, f_{99}$ the
frequencies of the percentile classes; and $n_1, n_2, \ldots, n_{99}$ the
accumulated frequencies in all classes below the respective percentile
classes, and if $P_i$ be the $i$th percentile, show that:

$$P_i = b_i + \left(\frac{\frac{iN}{100} - n_i}{f_i}\right) w, \qquad i = 1, 2, \ldots, 99$$
7 . Find the fifth, the fifteenth, and the seventy-fifth percentiles for the 
distribution in Exercise 2, page 54. 

8. The quartile points may be determined by simple arithmetic in a 
manner similar to that used in finding the median. (See p. 79.) Com- 
plete the outline on page 120. 

1 A normal distribution is one whose frequency curve is of the type
$y = Ce^{-h^2x^2}$. Chapter 12 will be concerned with normal distributions.




Consider the following distribution. By counting from the smaller
X-values we determine 52.5-62.5 to be the $Q_1$ class. Below this class
are 4 + 11 = 15 scores. We need to move up the scale above 52.5 a dis-
tance $z_1$ until we obtain 10 scores from the 32 scores of the $Q_1$ class,
and thus have 15 + 10 = 25, or $N/4$.

Class            X       f(x)

92.5-102.5      97.5       4
82.5- 92.5      87.5       9
72.5- 82.5      77.5      17
62.5- 72.5      67.5      23
52.5- 62.5      57.5      32
42.5- 52.5      47.5      11
32.5- 42.5      37.5       4

Total                    100

Distances:     from 52.5 to $Q_1$ is $z_1$; from 52.5 to 62.5 is 10
Frequencies:   from 52.5 to $Q_1$ is 10;    from 52.5 to 62.5 is 32

$$\frac{z_1}{10} = \frac{10}{32}$$

$$z_1 = (\quad)$$

$$Q_1 = 52.5 + (\quad) = (\quad)$$

By the method employed here, find $Q_3$ of this distribution.

9. What are the limiting values of the earnings of the middle half 
of each distribution of Exercise 12, page 74? 

10. Compute $Q_1$, $Q_3$, and $Q$ for the distribution of head-breadths
given in Exercise 2, page 54. Does $M_d \pm Q$ give values coincident
with $Q_3$ and $Q_1$? Can you suggest a reason?

11. Find $Q_3$ and $Q_1$ for the distribution of Exercise 3, page 42. Since
this is a distribution of discrete variates, what meaning can you give to
your computed values?

12. What are the limiting values of the earnings of the middle half of
the distribution given in Exercise 31, page 108?

13. Does $M_d \pm Q$ for the distribution of Exercise 12 above give values
coincident with $Q_3$ and $Q_1$? Can you suggest a reason?

14. In what units are the following constants measured: $M$, $M_d$, $M_o$,
Range, $Q_1$, $Q_3$, $Q$, and $V_q$?

15. Derive the formulas for $Q_1$ and $Q_3$ by the method of Exercise 8,
above.


33. THE MEAN DEVIATION 

In the previous chapter we have shown it to be a property of the 
arithmetic mean that the algebraic sum of the deviations from it is 
zero. The algebraic sum of the deviations about any other measure 
of central tendency will probably be small. Further, we have
emphasized in the beginning of this chapter that in measuring variability
we are interested in the amount and not in the direction of the varia-
tion. And, too, whatever constant is used to measure variability
should be one that is based upon all the original measures.

These considerations lead us to define the mean deviation as the
mean of the absolute values 1 of the deviations of the separate meas-
ures from some measure of central tendency. Although the mean de-
viation is a minimum when taken about the median, 2 which is a
splendid argument for insisting upon its being taken about that
average, yet it is more frequently taken about the mean. If $X$
is any measure and $M$ the mean, then:

$$\text{M.D. about } M = \frac{\Sigma |X - M| f(x)}{N}$$

We have previously designated the deviation of any measure from
the mean by $x$ (see Figure 1, p. 73), that is:

$$x = X - M$$

hence

$$\text{M.D. about } M = \frac{\Sigma |x| f(x)}{N}$$

Similarly, we may define the mean deviation about the median by
the formula:

$$\text{M.D. about } M_d = \frac{\Sigma |X - M_d| f(x)}{N}$$

Of course if the numbers are not arranged in a frequency distribution,
then considering each frequency as unity we have:

$$\text{M.D. about } M = \frac{\Sigma |x|}{N} = \frac{\Sigma |X - M|}{N}$$

$$\text{M.D. about } M_d = \frac{\Sigma |X - M_d|}{N}$$


Corresponding coefficients of relative dispersion may be found by
dividing any mean deviation by the average about which it is taken.
Thus:

$$V_{\text{M.D. about } M} = \frac{\text{M.D. about } M}{M}$$

1 The magnitude represented by a signed number is called the absolute value
or the numerical value of the number, and is indicated by placing a vertical line
on either side of the number. Thus the absolute value of +5 and of -5 is 5;
in symbols, $|+5| = |-5| = 5$.

2 Yule and Kendall, op. cit., p. 145.

For an illustrative example we shall compute the mean deviation
about the mean for the distribution of the grades in college algebra.
In Section 22 (p. 62) we computed the mean to be:

$$M = 74.48 \text{ c.u.}$$

We then have: $x = X - M = X - 74.48$

Table 22. Computing M.D. about M for the Grades in
College Algebra

  X      f(x)    |x| = |X - M|    |x| f(x)

 95        4        20.52           82.08
 90        6        15.52           93.12
 85       12        10.52          126.24
 80       19         5.52          104.88
 75       37         0.52           19.24
 70       24         4.48          107.52
 65       11         9.48          104.28
 60        6        14.48           86.88
 55        4        19.48           77.92
 50        2        24.48           48.96

 Total   125                       851.12

$$\text{M.D. about } M = \frac{851.12}{125} = 6.81 \text{ c.u.}$$

$$V_{\text{M.D. about } M} = \frac{6.81}{74.48} = 0.09143 = 9.1\%$$
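
The computation of Table 22 can be reproduced in a few lines. This is a
minimal sketch in Python; the variable names are introduced here for
illustration only.

```python
# Class marks and frequencies of the college algebra grades (Table 22).
marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
freqs = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]

n = sum(freqs)
m = sum(x * f for x, f in zip(marks, freqs)) / n  # arithmetic mean

# Mean of the absolute deviations from the mean, weighted by frequency.
mean_dev = sum(abs(x - m) * f for x, f in zip(marks, freqs)) / n

# Coefficient of relative dispersion: M.D. divided by the mean.
rel = mean_dev / m
print(round(m, 2), round(mean_dev, 2), round(rel, 4))
```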


Of the three measures of absolute variability that we have thus
far considered, the mean deviation is the only one which has con-
sidered the deviations of all the individual members from a given
average. The range and the semi-interquartile range are distances
that are not based upon the consideration of all the members of the
distribution. The mean deviation, however, is based upon all the
members of the group, is rigidly defined, is readily computed, and is
not difficult to comprehend. It gives due weight to the extreme items,
and is an especially good measure to use with economic data. The
artificial step of ignoring the signs of the deviations, of course,
renders it useless in further mathematical treatment.

It is a property of approximately normal distributions that the
interval

$$M \pm (\text{M.D. about } M)$$

includes about 58 per cent of the total frequency. For this distribu-
tion of college algebra grades this interval extends from 74.5 - 6.8
to 74.5 + 6.8, that is, from 67.7 to 81.3.



By constructing a portion of the histogram of Table 8 and recalling
that a class frequency is proportional to the area of the rectangle,
we find that

$$\frac{72.5 - 67.7}{5}(24) + 37 + \frac{81.3 - 77.5}{5}(19) = 23 + 37 + 14 = 74$$

scores lie in this interval. This is 59 per cent of the total frequency,
125, which checks the theory approximately.
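
The area argument can be written as a short routine. The sketch below is
an illustration only: the helper `count_between` is a name introduced
here, and it assumes, as the histogram argument does, that the measures
of each class are spread evenly over the class interval.

```python
def count_between(lo, hi, lower_bounds, freqs, width):
    """Estimate how many measures lie between lo and hi by taking the
    matching share of each histogram rectangle's area."""
    total = 0.0
    for b, f in zip(lower_bounds, freqs):
        overlap = min(hi, b + width) - max(lo, b)
        if overlap > 0:
            total += f * overlap / width
    return total

# Classes 47.5-52.5, ..., 92.5-97.5 of the college algebra grades.
lower_bounds = [47.5 + 5 * k for k in range(10)]
freqs = [2, 4, 6, 11, 24, 37, 19, 12, 6, 4]

inside = count_between(67.7, 81.3, lower_bounds, freqs, 5)
share = inside / sum(freqs)
print(round(inside, 2), round(share, 3))
```

The text rounds the two partial classes to 23 and 14 scores; carried
without rounding, the estimate is slightly above 74 scores, still close
to 59 per cent of the total.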

This example illustrates an important function of a measure of
dispersion when it is combined with a measure of central tendency.
They give a summarized description of the distribution because they
make possible the determination of intervals that include rather
definite proportions of the total frequency. Thus $M_d \pm Q$ deter-
mines an interval that includes about $N/2$ variates and $M \pm$ M.D.
determines an interval that includes about $3N/5$ variates.





EXERCISES 

1. Find $\Sigma x$, $\Sigma |x|$, and M.D. about M for each set of numbers:

       (a)                  (b)                  (c)

  X     x    |x|       X     x    |x|       X     x    |x|

  3                   62                   124
  5                   68                   146
 13                   74                   162
 20                   76                   178
 27                   88                   190
 58                   94                   220


2. Statistical data of the United States Department of Agriculture 
show the following average yields in bushels per acre for the three specified 
crops. Compute M.D . about M. 


Year   Wheat    Rye    Oats      Year   Wheat    Rye    Oats

1923   13.3    11.3    30.5      1928   15.4    11.7    32.9
1924   16.0    15.0    34.0      1929   13.0    11.4    29.3
1925   12.8    11.3    31.9      1930   14.2    12.8    32.2
1926   14.7    10.3    26.6      1931   16.3    10.4    28.1
1927   14.7    15.1    27.1      1932   13.0    12.2    30.1


3. Complete the following table. Find M.D . about M. What per 
cent of the total frequency is included in the interval M ± M.D . ? 


Class 

X 

/(*) 

x' 

*'/(*) 

X 

xf(x) 

!*/(*)! 

92.5- 102.5 

82.5- 92.5 

72.5- 82.5 

62.5- 72.5 

52.5- 62.5 

42.5- 52.5 

32.5- 42.5 

97.5 

87.5 

77.5 

67.5 

57.5 

47.5 

37.5 

4 
11 
32 
25 
15 

8 

5 

0 





Total 


100 







4 . Each of two marksmen A and B fires 10 shots at a horizontal line XY. 
Their records are indicated by the following diagrams. Basing your con- 
clusion upon the mean deviation, can you determine who made the better 
record? 






34. THE STANDARD DEVIATION 

Unquestionably the most universally used measure of dispersion
is the standard deviation. It is usually denoted by $\sigma_x$ (sigma), 1 and
is defined as the square root of the mean of the squares of all the in-
dividual deviations measured from the arithmetic mean. Expressed
as a formula, this definition becomes:

$$\sigma = \sqrt{\frac{\Sigma x^2}{N}} \qquad (5)$$

If the original measures are grouped in a frequency distribution,
the definition becomes:

$$\sigma = \sqrt{\frac{\Sigma x^2 f(x)}{N}} \qquad (6)$$

It will be noted that the squaring of the deviations removes the
objectionable feature of signs noted in the preceding section when
discussing the mean deviation. Further, the squaring gives added
weight to the extreme measures, a desirable feature for some types of
data. It should also be noted that taking the square root of the mean
of the squared deviations leaves $\sigma$ expressed in the original unit of
measure.

Formulas (5) and (6) should be learned in several forms, thus:

$$\sigma^2 = \frac{\Sigma x^2 f(x)}{N}, \qquad N\sigma^2 = \Sigma x^2 f(x), \text{ etc.}$$

1 We shall generally omit the subscripts, employing them only when neces-
sary, as in theoretical developments and for purposes of identification. [See
p. 61.]

Unless otherwise stated, the standard deviation is always computed
with the deviations measured from the arithmetic mean. This is
due to the theorem that the sum of the squares of the deviations
about $M$ is less than if taken about any other point. We shall soon
prove this theorem.

The quantity

$$\frac{\Sigma x^2 f(x)}{N}$$

is usually spoken of as the second moment (since each deviation is
squared) of the distribution about the mean expressed in (original
units)$^2$, and is designated by $\nu_2$ (read: nu two). Hence:

$$\sigma^2 = \nu_2$$

The quantity $\sigma^2$ is also known as the variance of the distribution.

The computation of $\sigma$ from the definition, or formula (5), is a
decidedly simple though sometimes tedious matter. Let us consider
the familiar distribution of college algebra marks as previously con-
sidered in Tables 8, 15, and 17. The arithmetic mean has been
found to be 74.48. The following table shows the steps involved.


Table 23. Computing $\sigma$ for the Distribution of Grades in
College Algebra by the Definition (M = 74.48)

  X     f(x)    x = X - M       x^2          x^2 f(x)

 95       4       20.52      421.0704      1,684.2816
 90       6       15.52      240.8704      1,445.2224
 85      12       10.52      110.6704      1,328.0448
 80      19        5.52       30.4704        578.9376
 75      37        0.52        0.2704         10.0048
 70      24      - 4.48       20.0704        481.6896
 65      11      - 9.48       89.8704        988.5744
 60       6     - 14.48      209.6704      1,258.0224
 55       4     - 19.48      379.4704      1,517.8816
 50       2     - 24.48      599.2704      1,198.5408

 Total  125                               10,491.2000

$$\sigma^2 = \nu_2 = \frac{10{,}491.2}{125} = 83.9296 \text{ (c.u.)}^2$$

$$\sigma = \sqrt{\nu_2} = 9.16 = 9.2 \text{ c.u. (approximately)}$$
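
The steps of Table 23 translate directly into code. The following Python
sketch applies the definition to the grouped data; the variable names are
illustrative and not part of the text.

```python
from math import sqrt

# Class marks and frequencies of the college algebra grades (Table 23).
marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
freqs = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]

n = sum(freqs)
m = sum(x * f for x, f in zip(marks, freqs)) / n

# Second moment about the mean: mean of the squared deviations.
variance = sum((x - m) ** 2 * f for x, f in zip(marks, freqs)) / n
sigma = sqrt(variance)
print(round(variance, 4), round(sigma, 2))
```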



It may frequently happen that the measures are not sufficiently
numerous to warrant their arrangement in a frequency distribution.
Thus, consider the 10 scores in centigrade units that were made
on a certain test by 10 students of algebra. The scores are given in
column one of the table. To apply formula (5) to these values we
proceed, as the table shows, to find $M$ and then $x$ corresponding to
each score.

   X        x = X - 81       x^2

   86            5            25
   93           12           144
   73          - 8            64
   66         - 15           225
   88            7            49
   96           15           225
   80          - 1             1
   70         - 11           121
   95           14           196
   63         - 18           324

 ΣX = 810        0          1374
 M = 81

We have

$$N = 10, \qquad \Sigma X = 810, \qquad M = 81 \text{ c.u.}, \qquad \Sigma x^2 = 1374$$

$$\sigma = \sqrt{\frac{1374}{10}} = 11.72 \text{ c.u.}$$
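
For ungrouped measures the same definition needs only a few lines. The
sketch below uses the ten scores of the table; `pstdev` from Python's
standard `statistics` module divides by N, as formula (5) does, and is
used here only as a cross-check.

```python
from math import sqrt
from statistics import pstdev

scores = [86, 93, 73, 66, 88, 96, 80, 70, 95, 63]

n = len(scores)
m = sum(scores) / n                         # arithmetic mean
sum_sq = sum((x - m) ** 2 for x in scores)  # sum of squared deviations
sigma = sqrt(sum_sq / n)                    # formula (5)

print(m, sum_sq, round(sigma, 2), round(pstdev(scores), 2))
```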



EXERCISES 

1. Statistical data of the United States Department of Agriculture show 
the following average yields in bushels per acre for the three specified 
crops. Compute $\sigma$ for each grain.

Year   Wheat    Rye    Oats      Year   Wheat    Rye    Oats

1923   13.3    11.3    30.5      1928   15.4    11.7    32.9
1924   16.0    15.0    34.0      1929   13.0    11.4    29.3
1925   12.8    11.3    31.9      1930   14.2    12.8    32.2
1926   14.7    10.3    26.6      1931   16.3    10.4    28.1
1927   14.7    15.1    27.1      1932   13.0    12.2    30.1

How many of the given 10 values are included in the interval $M \pm \sigma$?
Test for each grain.

2. a. Prove: $M_{X+A} = M_X + A$
State this theorem in words.

b. Prove: $M_{X-A} = M_X - A$
State this theorem in words.

c. Prove: $\sigma_{X+A} = \sigma_X$

d. Prove: $\sigma_{X-A} = \sigma_X$

e. Prove: $\Sigma[X - M]^2 = \Sigma X^2 - NM^2 = \Sigma X^2 - \dfrac{(\Sigma X)^2}{N}$



3. Compute $\sigma$ for the given distribution.

Class           X      f(x)

32.5-37.5      35        2
27.5-32.5      30        8
22.5-27.5      25       12
17.5-22.5      20       26
12.5-17.5      15       16
 7.5-12.5      10        6
 2.5- 7.5       5        2

Total                   72

Owing to the fact that the value $M$ usually comes out decimally,
computing $\sigma$ by formula (6) is usually laborious, even tedious, hence
we are driven to seek other methods. We shall develop two other im-
portant methods for computing $\sigma$. The first method will express $\sigma$ in
terms of the original variates, $X_i$, and the second will express $\sigma$ in
terms of $x_i'$, deviations in class units of $X_i$ from the arbitrary origin.

Referring to Figure 1 (p. 73), we note that:

$$x = X - M$$

Hence:

$$\sigma^2 = \frac{\Sigma x^2 f(x)}{N} = \frac{\Sigma (X - M)^2 f(x)}{N}
= \frac{\Sigma (X^2 - 2MX + M^2) f(x)}{N}$$

$$= \frac{\Sigma X^2 f(x)}{N} - \frac{2M\,\Sigma X f(x)}{N} + \frac{M^2\,\Sigma f(x)}{N}$$

But:

$$\frac{\Sigma X f(x)}{N} = M \quad\text{and}\quad \Sigma f(x) = N$$

Therefore:

$$\sigma^2 = \frac{\Sigma X^2 f(x)}{N} - 2M^2 + M^2 = \frac{\Sigma X^2 f(x)}{N} - M^2$$

from which we obtain:

$$\sigma = \sqrt{\frac{\Sigma X^2 f(x)}{N} - M^2}
= \sqrt{\frac{\Sigma X^2 f(x)}{N} - \left(\frac{\Sigma X f(x)}{N}\right)^2} \qquad (7)$$



This formula gives a straightforward method for computing $\sigma$
when the original values of $X$ are not too large or when a table of
squares is accessible. We shall illustrate the use of the formula for
the distribution of college algebra grades. As a matter of fact this
table is a continuation of Table 15 (p. 62).

Table 24. Computing $\sigma$ of the Grades in College Algebra by (7)

  X      f(x)     X f(x)     X^2 f(x)

 95        4        380       36,100
 90        6        540       48,600
 85       12      1,020       86,700
 80       19      1,520      121,600
 75       37      2,775      208,125
 70       24      1,680      117,600
 65       11        715       46,475
 60        6        360       21,600
 55        4        220       12,100
 50        2        100        5,000

 Total   125      9,310      703,900

$$M = \frac{9310}{125} = 74.48, \qquad M^2 = 5547.2704$$

$$\frac{\Sigma X^2 f(x)}{N} = \frac{703900}{125} = 5631.2$$

$$\sigma = \sqrt{5631.2 - 5547.2704} = \sqrt{83.9296} = 9.16 \text{ c.u.}$$
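
Formula (7) needs only the two column totals of Table 24. A minimal
Python sketch (the variable names are introduced here for illustration)
is:

```python
from math import sqrt

marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
freqs = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]

n = sum(freqs)
sum_xf = sum(x * f for x, f in zip(marks, freqs))       # column X f(x)
sum_x2f = sum(x * x * f for x, f in zip(marks, freqs))  # column X^2 f(x)

mean_x = sum_xf / n
# Formula (7): mean of the squares minus the square of the mean.
sigma = sqrt(sum_x2f / n - mean_x ** 2)
print(sum_xf, sum_x2f, round(sigma, 2))
```

With large values of X the two terms under the radical are nearly equal,
so several decimal places should be carried before subtracting.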


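A third, and still more useful, method for computing $\sigma$ will now
be established. The method is analogous to that used in deriving
formula (3) of Section 24 (p. 71). From Figure 1 (p. 73) we have

$$x + wb_x = wx' \quad\text{or}\quad x = w(x' - b_x)$$

where $w$, $x'$, and $b_x$ are defined as in Section 24. Then:

$$\sigma^2 = \frac{\Sigma x^2 f(x)}{N} = \frac{\Sigma [w(x' - b_x)]^2 f(x)}{N}$$

$$= w^2\left[\frac{\Sigma x'^2 f(x)}{N} - \frac{2b_x\,\Sigma x' f(x)}{N} + \frac{b_x^2\,\Sigma f(x)}{N}\right]$$

Recalling that $\dfrac{\Sigma x' f(x)}{N} = b_x$ and $\Sigma f(x) = N$, we have:

$$\sigma^2 = w^2\left[\frac{\Sigma x'^2 f(x)}{N} - b_x^2\right] \qquad (8)$$

$$\sigma = w\sqrt{\frac{\Sigma x'^2 f(x)}{N} - b_x^2} \qquad (9)$$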


Computing $\sigma$ by (9), which is based upon the class interval as a unit
of measure, we shall call the short method for computing the standard
deviation. We shall illustrate its use by computing $\sigma$ for the dis-
tribution of the grades in college algebra. It will be noted that
Table 25 is a mere continuation of Table 17 (p. 73).

Table 25. Computing $\sigma$ for the Grades in College Algebra by (9)

  X     f(x)    x' = (X - 75)/5    x' f(x)    x'^2 f(x)

 95       4            4              16          64
 90       6            3              18          54
 85      12            2              24          48
 80      19            1              19          19
 75      37            0               0           0
 70      24          - 1            - 24          24
 65      11          - 2            - 22          44
 60       6          - 3            - 18          54
 55       4          - 4            - 16          64
 50       2          - 5            - 10          50

 Total  125                         - 13         421

h = 75, $w = 5$, $N = 125$

$$b_x = \frac{-13}{125} = -0.104, \qquad b_x^2 = 0.010816$$

$$M = 75 + 5(-0.104) = 74.48 \text{ c.u.}$$

$$\frac{\Sigma x'^2 f(x)}{N} = \frac{421}{125} = 3.368$$

$$\sigma = 5\sqrt{3.368 - 0.010816} = 5\sqrt{3.357184} = 5(1.832) = 9.16 \text{ c.u.}$$

The observant student will note that in computing $\sigma$ we have the
quantities needed to compute $M$.
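
The short method can be sketched in Python as below. The names `origin`,
`width`, and `coded` are assumptions made for this illustration; `coded`
holds the deviations $x'$ in class units from the arbitrary origin 75.

```python
from math import sqrt

marks = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
freqs = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]
origin, width = 75, 5

# x' = (X - 75) / 5; the marks are exact multiples of the class width,
# so integer division gives the class-unit deviations exactly.
coded = [(x - origin) // width for x in marks]

n = sum(freqs)
sum_cf = sum(c * f for c, f in zip(coded, freqs))       # column x' f(x)
sum_c2f = sum(c * c * f for c, f in zip(coded, freqs))  # column x'^2 f(x)

b = sum_cf / n                           # b_x, in class units
m = origin + width * b                   # the mean comes out as a by-product
sigma = width * sqrt(sum_c2f / n - b ** 2)  # formula (9)
print(sum_cf, sum_c2f, round(m, 2), round(sigma, 2))
```

Because the coded deviations are small integers, the column totals can be
found mentally, which is the practical appeal of the method.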

The quantity $\dfrac{\Sigma x'^2 f(x)}{N}$ is usually denoted in statistics by
$\nu_2'$ (read: nu two prime), and is called the second moment about the
arbitrary origin expressed in (class units)$^2$. Hence:

$$\sigma^2 = \nu_2 = w^2(\nu_2' - \nu_1'^2)$$


If we write formula (8) in the form

$$\sigma^2 = \frac{\Sigma (wx')^2 f(x)}{N} - (wb_x)^2$$

or

$$N\sigma^2 = \Sigma (wx')^2 f(x) - N(wb_x)^2$$

a careful interpretation leads to an important theorem to which
attention has previously been called. For $N\sigma^2 = \Sigma x^2 f(x)$ is the sum
of the squares of the deviations of the variates about the mean;
$\Sigma (wx')^2 f(x)$ is the sum of the squares of the deviations of the variates
about any point, and $N(wb_x)^2$ is a positive quantity. Hence the
theorem: the sum of the squares of the deviations of the variates about
the mean is less than the sum of the squares of the deviations about
any other point. [See Exercise 28 at end of chapter.]
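
The theorem can be checked numerically on the ten test scores used
earlier in this section. The sketch below is an illustration only; the
trial points are arbitrary choices made here, and `sum_sq_about` is a
helper name introduced for the example.

```python
scores = [86, 93, 73, 66, 88, 96, 80, 70, 95, 63]
n = len(scores)
m = sum(scores) / n  # arithmetic mean, 81

def sum_sq_about(point):
    """Sum of the squares of the deviations taken about `point`."""
    return sum((x - point) ** 2 for x in scores)

about_mean = sum_sq_about(m)
for point in (70, 75, 80, 85, 90):
    about_point = sum_sq_about(point)
    # Sum about any point = sum about the mean + N * (distance to mean)^2.
    assert abs(about_point - (about_mean + n * (m - point) ** 2)) < 1e-9
    assert about_point > about_mean
    print(point, about_point)
```

Each trial point exceeds the sum about the mean by exactly N times the
square of its distance from the mean, which is the content of the
rearranged formula (8).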

If dispersion is to be measured by the root-mean-square deviation 
about some point, the above theorem recommends our taking M 
for that point, for it is about M that the root-mean-square deviation 
has a unique value. 

The coefficient of relative dispersion based upon the standard
deviation is known as the coefficient of variation, and is defined by the
formula:

$$V_\sigma = \frac{\sigma}{M} \qquad (10)$$

and is usually expressed as a percentage. That is, the variability
is expressed as a certain per cent of the mean.

A word of comment at this point with regard to formula (10) in 
particular and to relative variation in general may be desirable. The 
arbitrary ratio of the standard deviation to the arithmetic mean as a 
measure of relative variation as well as the other ratios that we have 
used, e.g. formula (4), seems to be based more on psychological than 
on logical grounds. 

Despite individual variation that we have noted among statistical 
phenomena, we have learned from experience to formulate judgments 
of the individual of normal size. That is, the establishment of norms
seems to be a natural process. We hear the expressions: “What a
large apple!” “My, isn't she tiny?” “How emaciated he is!” 
“What a tremendous ear of corn!” “Wasn't that a hard rain?” 
“What a hot day in May!”. All these expressions imply the notion 
of a norm as well as variation from a norm. 

We have also formed judgments, which at this time may be crude 
and inadequate, of relative variation with respect to the norm. Any 
student, without using his statistical analysis, knows that a nose one 
inch longer than the average length of noses is more monstrous than 
a height that is one inch longer than the average of the heights. In 
other words, a variation is large or small depending upon the norm 
with which it is associated . 

Doubtless such considerations as the above led Professor Karl 
Pearson to define the coefficient of variation as the ratio of the 
standard deviation to the arithmetic mean. The arithmetic mean 
is taken to be the norm, and the standard deviation measures the 
variation from the norm. 

We should develop a statistical alertness to relative variation in
characters that are less familiar. Thus, we have found for a distribution
of weights of college men that M = 138.9 lbs. We shall
find that σ = 17.2 lbs., and hence V_σ = 17.2/138.9 = 0.124 = 12.4%.
That is, for a group of weights of young men, the standard deviation
is about 12.5 per cent, or one-eighth, of the mean. The heights of
these same men will give M = 67.9 in. and σ = 2.4 in., and hence
V_σ = 2.4/67.9 = 0.035 = 3.5%. That is, for a group of heights
of young men the standard deviation is about 3.5 per cent of the
mean. A distribution of weights, then, shows much more variation
than a distribution of heights.
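
The comparison above can be reproduced in a few lines. This is a minimal Python sketch of formula (10) expressed as a percentage; the function name `coefficient_of_variation` is our own, and the inputs are the means and standard deviations quoted in the text.

```python
def coefficient_of_variation(sigma, mean):
    """Standard deviation as a percentage of the arithmetic mean."""
    return 100.0 * sigma / mean

weights_cv = coefficient_of_variation(17.2, 138.9)  # weights, in lbs.
heights_cv = coefficient_of_variation(2.4, 67.9)    # heights, in inches

print(round(weights_cv, 1), round(heights_cv, 1))
print(weights_cv > heights_cv)  # weights vary relatively more than heights
```

Because the result is a pure ratio, the two characters can be compared even though one is measured in pounds and the other in inches.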

The general literature of biometry records coefficients of variation 
for many characters. We present herewith a few of them. 


Character                   V_σ (%)     Character                V_σ (%)

Visual acuity                39.12      Pulse rate per min.       14.89
Wt. of heart (unhealthy)     32.39      Chest circumference        8.45
Grip, right hand             25.93      Length of forearm          5.24
Wt. of heart (healthy)       17.71      Length of foot             4.59
                                        Stature (English)          3.99





Economic data generally show a much larger variation than do
biometric data. (Of course the coefficients of variation for much
economic data will not remain constant but will vary from time to
time.) The weekly earnings of 72,000 Illinois coal miners were
analyzed. The analysis gave M = $8.37, σ = $2.49, and V_σ = 29.7.
An analysis of the price of potatoes gave M = 54.4 cents, σ = 11.11
cents, and V_σ = 19. The variation in economic phenomena will be
especially considered in Chapter 6 on Index Numbers.

EXERCISES 

1. What norm was used in the development of the formula for V_σ?

2. Compute σ for the earnings of each group of Exercise 12, p. 74.
The earnings of which group show the greater dispersion?

3. Compute σ for the distribution of the salaries of federal employees
that is given in Exercise 36, p. 109. Is it possible to apply formula (9) to
this distribution?

4. Compute σ for the distribution of the annual wages of chief wage
earners that is given in Exercise 31, p. 108. What is the coefficient of
variation for this distribution?

5. Table A gives the I.Q.'s of 905 school children. Table B gives the
weights of 1,000 school children. For each distribution find: M, M_d,
M_o, σ. Does the interval M ± 3σ include all the variates of each
distribution?


      Table A                     Table B

     X       f(x)                X       f(x)

    60.5       3                29.5       1
    70.5      21                33.5      14
    80.5      78                37.5      56
    90.5     182                41.5     172
   100.5     305                45.5     245
   110.5     209                49.5     263
   120.5      81                53.5     156
   130.5      21                57.5      67
   140.5       5                61.5      23
                                65.5       3

   Total     905                Total   1000


6. Derive formula (9) from formula (7).





35. THE NORMAL CURVE 1 

In Section 19 (p. 51) reference was made to the normal distribution 
and to the general form of the equation that represents it. This 
curve is so important in statistical work, both theoretical and applied, 
that, although we discuss it rather fully in Chapter 12, we desire at 
this point to call attention to some of its properties. The general form 
of the curve is shown by curve (b) of Chart 8 and by the curves (a), 
(b), and (c) of Section 30 (p. 111). The normal curve is characterized 
by the symmetrical arrangement of all the variates with respect to a 
line through the central value, most of the observations lying close 
to the mean and very few differing from it considerably. 

The normal curve is of importance to us just now in that its proper- 
ties will assist us in making certain generalizations about distributions 
that do not differ too markedly from normality. And such distribu- 
tions are not at all rare. Measurements of natural objects — such 
as the lengths of the leaves on a tree, the heights of men, the lengths 
of bean pods, the breadths of the heads of men, the lengths and 
breadths of nuts — distribute themselves with a surprising closeness to 
normality if large samples are taken. In an approximately normal 
distribution of a thousand observations we can estimate with sur- 
prising accuracy the number that differ from the mean by definite 
amounts, say σ, 2σ, 3σ, etc. In fact these relations are so regular
with the measurements of natural objects that those which are so 
distributed are said to be normal. As has been previously noted, 
many data collected from the fields of psychology and education are 
also of this type. 

Chart 9 shows a normal curve. The mean, median, and mode 
coincide at 0. It has a maximum at the center and is symmetrical 
with respect to the vertical line through 0. The curve crosses its 
tangent, that is, the curve changes from concave to convex, at
points I_1 and I_2 which are at a distance σ from the vertical through
0. The curve approaches the X-axis as x gets large, though we seldom
extend it beyond 3σ in either direction from 0 because the
number of such deviations outside M ± 3σ is relatively insignificant.

We have laid off certain multiples of σ on either side of the mean.
It will be proved in Chapter 12 that: 

1 If the reader desires to know more about the normal curve, its history and 
importance, he should read Section 101. 



Chart 9. A Normal Curve



The interval from M − σ to M + σ
includes approximately 2/3 of N.

The interval from M − 2σ to M + 2σ
includes approximately 95 per cent of N.

The interval from M − 3σ to M + 3σ
includes approximately 99 per cent of N.

Further, it will be shown that:

The range equals 6σ approximately.

Q equals 2/3 σ approximately.

M.D. from M equals 4/5 σ approximately.

Of course as an observed distribution departs from normality, the 
approximations are less close. 
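
The rounded fractions quoted above can be compared with the exact areas under a normal curve. The following Python sketch is an illustration under the assumption of an exactly normal distribution; the function name `coverage` is our own. It uses the error function from the standard library, since the fraction within k standard deviations of the mean is erf(k/√2).

```python
import math

def coverage(k):
    """Fraction of a normal distribution lying within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(coverage(k), 4))

# Mean deviation of a normal distribution in units of sigma: sqrt(2/pi).
md_over_sigma = math.sqrt(2 / math.pi)
print(round(md_over_sigma, 4))
```

The value for one σ is a little over two thirds, and the value for three σ is somewhat above 99 per cent, so the figures in the text are convenient roundings.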

The number of units of σ that must be laid off on either side
of M of a normal distribution to include the total frequency, N,
varies with N. If N is very large, more than ± 3σ is necessary,
whereas if N is small less than ± 3σ is needed. The following table
gives the interval that includes N for a normal distribution.







      N        Interval              N        Interval

     10       M ± 1.65σ            200       M ± 2.81σ
     20       M ± 1.96σ            500       M ± 3.0σ
     30       M ± 2.13σ          1,000       M ± 3.3σ
     50       M ± 2.33σ         10,000       M ± 3.9σ
    100       M ± 2.58σ        100,000       M ± 4.4σ
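
The text does not state how the tabulated multipliers were obtained. One reading that appears to agree closely with most of the entries, offered here only as an assumption, is that the interval M ± kσ is chosen so that a normal distribution is expected to have a single variate out of N lying outside it, that is, a two-tailed area of 1/N. The Python sketch below computes k on that assumption; the function name `half_width` is our own.

```python
from statistics import NormalDist

def half_width(n):
    """k such that the two tails beyond M ± k*sigma hold an area of 1/n.

    This rule is an assumed reading of the table, not one given in the text.
    """
    return NormalDist().inv_cdf(1 - 1 / (2 * n))

for n in (10, 20, 30, 50, 100, 200, 500, 1000, 10000, 100000):
    print(n, round(half_width(n), 2))
```

Running the sketch lets the reader compare each computed multiplier with the tabulated one and judge how well the assumed rule fits.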


For the distribution of college algebra grades we have found: 

M = 74.48 c.u.        M − σ = 65.32 c.u.

σ = 9.16 c.u.         M + σ = 83.64 c.u.

How many of the 125 grades of the sample lie in this interval from 
65.32 to 83.64? 

To assist us in answering the question let us construct the histogram 
for the central portion of Table 8 (p. 26). 


Figure 13 



The interval evidently includes the total frequencies of the three
central groups (24 + 37 + 19 = 80), and an undetermined part of
the classes designated by the class marks 65 and 85. From 65.32
to 67.5 is 2.18, and since the variates are assumed to be uniformly
distributed over the class interval we must include (2.18/5)11 = 4.79.
Similarly, from 82.5 to 83.64 is 1.14, and hence we must include
(1.14/5)12 = 2.73. Hence the interval from 65.32 to 83.64 includes
80 + 4.79 + 2.73 = 87.52, or about 70 per cent of the 125 variates.
The result here is more than 2/3 N for the reason that our
distribution is loaded at 75.
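
The interpolation just described can be written as a short routine. The Python sketch below is an illustration only: the class boundaries (62.5, 67.5, and so on) are inferred from the figures 67.5 and 82.5 and the class width of 5 used above, the assignment of the frequencies 24, 37, 19 to the class marks 70, 75, 80 is assumed, and the name `count_in_interval` is our own. Each class contributes its frequency in proportion to the part of its width that falls inside the interval.

```python
def count_in_interval(classes, lo, hi):
    """Estimated number of variates between lo and hi.

    classes is a list of (lower boundary, upper boundary, frequency);
    variates are taken to be spread uniformly over each class.
    """
    total = 0.0
    for a, b, f in classes:
        overlap = max(0.0, min(b, hi) - max(a, lo))
        total += f * overlap / (b - a)
    return total

# Central classes of the college algebra distribution (boundaries assumed).
classes = [
    (62.5, 67.5, 11),
    (67.5, 72.5, 24),
    (72.5, 77.5, 37),
    (77.5, 82.5, 19),
    (82.5, 87.5, 12),
]

count = count_in_interval(classes, 65.32, 83.64)
percent = 100 * count / 125
print(round(count, 2), round(percent, 1))
```

Carried to more decimal places than the hand computation, the two partial classes contribute slightly more, so the last digit of the total may differ by one from the figure in the text.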





When dealing with a distribution of discrete variates, interpola- 
tion is usually not necessary. For example consider the distribution 
given in Exercise 3 (p. 42). We have previously computed for this 
distribution : 

M = 53.67 c.u.

σ = 2.16 c.u.

M − σ = 51.51 c.u.

M + σ = 55.83 c.u.

The interval from 51.51 to 55.83 includes the frequencies with class 
marks at 52, 53, 54, and 55, that is, a total of 468 ( = 96 + 134 + 127 
+ 111) or 66.57 per cent of N. 


36. THE PROBABLE ERROR 

A measure of dispersion that particularly relates to the normal
curve is the probable error,¹ E_x. It is a distance which, when laid
off on either side of the mean of a normal curve, defines an interval
that includes one half the total area under the curve. Stated somewhat
differently, the probable error of a distribution of variates
normally distributed is that deviation on either side of the mean
within which half the variates lie. Then, since half the total frequency
lies within the interval M_x − E_x to M_x + E_x, it is an even
chance that a variate selected at random falls within this interval.

The following figure may assist in clarifying the probable error 
concept. This figure shows the per cent of the total frequency that 
is included by the indicated probable error units. 

The probable error is closely related to the standard deviation. 
The relationship is indicated by the equation 

\[ E_x = 0.6745\,\sigma_x \tag{11} \]

Approximately then, E_x is about 2/3 σ_x and σ_x is about 3/2 E_x. If a
distribution is not normal, it is customary to define the probable error
by (11).

Since any multiple of σ can be expressed in terms of E, and vice
versa, it is natural to inquire why we have both and what are the
advantages of E. The standard deviation σ, or a synonym for it,
is the older measure. The early nineteenth-century astronomers,
particularly Bessel in 1815 and Gauss in 1816, who were among
the first men to work with statistical analysis, desired an interval
within and outside of which it is equally probable that a random
measurement of a normal distribution will occur. This interval on
the X-scale is from M − E to M + E, or from M − 0.6745σ to
M + 0.6745σ. Bessel first used the term probable error, Gauss
and the contemporary writers liked the term, and so tradition has
kept it in use to this day.

¹ Generally we shall omit the subscript, employing it only for identifications.

Figure 14

Of course there is a facility of language when using the probable 
error that may account for its popularity. For example, it is an 
“even chance,” a “fifty-fifty chance,” or a “one-to-one shot” that 
a measure selected at random from a normal distribution falls within 
or without M ± E. In other words it is as “likely as not” that a 
measure selected at random from a group of normally distributed 
variates will fall within the interval M ± E. Equally simple lan- 
guage does not obtain when using the standard deviation. Thus, 
assume a distribution of the heights of 1,500 men with M = 67.5 in., 
σ = 2.5 in., and E = 0.6745(2.5) = 1.7 in. Then if a measure is
selected at random from this group it is as likely as not to fall within 
the interval 67.5 ±1.7 inches. 
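
The factor 0.6745 in formula (11) is the normal deviate that cuts off the upper quarter of the curve, so that half the area lies between −0.6745σ and +0.6745σ. The Python sketch below is a minimal check of that statement and of the heights example, assuming an exactly normal distribution; the names `PE_FACTOR` and `probable_error` are our own.

```python
from statistics import NormalDist

# Half the area of a normal curve lies between its quartiles, so the
# probable error is the 75th-percentile deviate multiplied by sigma.
PE_FACTOR = NormalDist().inv_cdf(0.75)

def probable_error(sigma):
    return PE_FACTOR * sigma

# Heights example from the text: M = 67.5 in., sigma = 2.5 in.
e = probable_error(2.5)
heights = NormalDist(67.5, 2.5)
inside = heights.cdf(67.5 + e) - heights.cdf(67.5 - e)

print(round(PE_FACTOR, 4), round(e, 1), round(inside, 3))
```

The last printed value is the share of the distribution falling within M ± E, which is the "even chance" of the text.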





EXERCISES 

1. The data of the following tables are taken from Bulletin No. 620 
of the U.S. Department of Labor, “Wages, Hours, and Working Condi- 
tions in the Folding-Paper-Box Industry, 1933, 1934, 1935.” They present 
the hourly earnings of employees in the U.S. in the paper-box industry. 

Compute M, σ, and E for each table.


     X              May, 1933     August, 1934     August, 1935
  (cents)             f(x)            f(x)             f(x)

  10 a.u.  15           57               1                0
  15 a.u.  20          168               2                8
  20 a.u.  25          538               9               19
  25 a.u.  30          507              20               96
  30 a.u.  35          622             231              286
  35 a.u.  40          541            1371             1332
  40 a.u.  45          485            1834             1670
  45 a.u.  50          357             919              969
  50 a.u.  55          328             729              806
  55 a.u.  60          172             484              563
  60 a.u.  70          327             739              758
  70 a.u.  80          211             420              457
  80 a.u. 100          174             533              555
 100 a.u. 120           42             224              264
 120 a.u. 150           17              85               82

  Total               4546            7601             7865


2. Let the five numbers 3, 4, 5, 6, 7 be a universe. Select different
samples of three from these five numbers, 10 samples in all, and compute
their means. Thus

\[ M_1 = \frac{3+4+5}{3}, \quad M_2 = \frac{3+4+6}{3}, \quad M_3 = \frac{3+4+7}{3}, \quad \ldots, \quad M_{10} = \frac{5+6+7}{3} \]

a. Find the mean of the 10 sample means. How does it compare with 
the mean of the universe? 

b. Find the standard deviation of the 10 sample means. How does it 
compare with the standard deviation of the universe? 

3. Consider the universe of numbers 5, 10, 15, 20, 25. Treat these 
numbers as you did those of Exercise 2. 

4. The problem of sampling has been called by Karl Pearson the fundamental
problem in statistics. Often our only statistical knowledge of the
parent population is obtained from a study of samples drawn from it.
Stated in rather general terms, the question is: how well does the sample
describe the parent population? More precisely, the problem is: how do
M and σ computed from the sample compare with M and σ computed
from the parent population?

Table 26(a) presents 10 sample distributions of 100 each. The
parent population consists of 1,000 individuals. (By assigning the
sample distributions to various members of the class the computational
labor will be greatly lightened.)

(a) Find M and σ for each sample and for the total.

(b) Find the standard deviation of the 10 means of the samples.

(c) How many of the means are included in the interval (Mean of means)
± (standard deviation of means)?

(d) Compare the mean of the 10 means with the mean of the total.

Table 26(a). Distribution of the Weights of 1,000 Male Students
(Measurements to nearest pound)

Class                                   Frequencies
Mark       1st   2nd   3rd   4th   5th   6th   7th   8th   9th  10th   Total
(Pounds)   100   100   100   100   100   100   100   100   100   100

   95.25     -     -     1     -     -     -     1     -     -     -       2
  105.25     2     5     2     4     -     1     3     3     -     1      21
  115.25    11    11    13     9    10     9     9    14     9     9     104
  125.25    14    24    15    21    19    25    20    14    21    23     196
  135.25    26    16    31    19    31    21    30    22    27    25     248
  145.25    23    17    16    26    21    19    17    21    18    19     197
  155.25    11    16    11    11    13    16    12    17    11    15     133
  165.25     2     5     3     5     4     5     5     5     9     4      47
  175.25     5     1     5     0     1     1     2     4     2     4      25
  185.25     3     3     1     4     0     1     1     -     1     -      14
  195.25     1     1     1     1     0     2     -     -     1     -       7
  205.25     0     1     1     -     1     -     -     -     1     -       4
  215.25     0     -     -     -     -     -     -     -     -     -       0
  225.25     0     -     -     -     -     -     -     -     -     -       0
  235.25     1     -     -     -     -     -     -     -     -     -       1
  245.25     1     -     -     -     -     -     -     -     -     -       1

   Total   100   100   100   100   100   100   100   100   100   100    1000

       M

       σ
















5. Treat the data in Table 26(b) as you did those of Table 26(a).


Table 26(b). Distribution of the Heights of 1000 Male Students
(Measurements to nearest ^ inch)

Class                                   Frequencies
Mark       1st   2nd   3rd   4th   5th   6th   7th   8th   9th  10th   Total
(Inches)   100   100   100   100   100   100   100   100   100   100

   59.45     -     1     -     -     -     -     -     -     -     -       1
   60.45     1     0     -     -     -     -     -     -     -     -       1
   61.45     1     0     -     2     -     -     2     2     -     -       7
   62.45     3     3     1     2     1     5     2     1     -     -      18
   63.45     6     6     4     5     3     3     0     2     3     1      33
   64.45     6     4     9     3     6    11     3     6     7     8      63
   65.45    11     7    13    14    10     9    10     6     9     8      97
   66.45    12    12    11    13    17    11    19    12    18    12     137
   67.45    13    23    17    22    13    10    14    14    13    16     155
   68.45    12    15    20    15    20    16    22    26    17    17     180
   69.45    14     8     4     9    11    13    12    15    14    22     122
   70.45     9     8     5     4    11     5     4     6    10     6      68
   71.45     7     8     8     6     4     7     9     7     3     6      65
   72.45     2     2     3     4     2     5     1     2     4     3      28
   73.45     1     1     2     0     2     3     1     1     2     1      14
   74.45     1     1     3     0     -     1     0     -     -     -       6
   75.45     0     0     -     0     -     1     1     -     -     -       2
   76.45     1     1     -     1     -     -     -     -     -     -       3

   Total   100   100   100   100   100   100   100   100   100   100    1000

       M

       σ













6. Find the standard deviation of the ten standard deviations of 
Table 26(a), Exercise 4. Which shows the greater dispersion, the sample 
means or the sample standard deviations? 

7. Find the standard deviation of the ten standard deviations of 
Table 26(b), Exercise 5. Which shows the greater dispersion, the sample 
means or the sample standard deviations? 

37. THE SIGNIFICANCE OF THE MEAN AND THE 
STANDARD DEVIATION 

Thus far our statistical analysis of a given group has enabled us to 
abstract certain qualities of the group. The most important of these 
qualities are: (1) the central or typical condition of the group, and
(2) the degree of variability of the members of the group. The
central condition can be obtained from the appropriate measure of 
central tendency, and the degree of variability from the appropriate 
measure of dispersion, preferably the standard deviation and the 
coefficient of variation. For example, important facts of the dis- 
tribution of college algebra marks are: 

M = 74.48 c.u.        σ = 9.16 c.u.

These summarizing constants contain the kernel of the distribution. 
They give a fairly complete numerical description of the sample. 

Now what statistical judgments concerning the parent population 
— which consists of all the grades in college algebra that are recorded 
at Bucknell University — can one form from the examination of the 
sample? 1 We do not desire to answer this question completely at 
this point, as Chapter 13 is devoted entirely to the problem that we 
raise here, but we may appropriately state certain facts, which must 
at this time be accepted without proof. 

If the sample that we have been considering was a random one — 
that is, if any mark in college algebra had the same chance of being 
selected a member of our sample as any other mark — we may 
expect that if another sample were selected its mean and its standard 
deviation would differ but slightly from those we have computed. 
Furthermore, we may expect the true mean and the true standard 
deviation of the parent population to differ but little from those of 
the sample. 

It has become customary, therefore, for statisticians who desire to
make statistical estimates of the parent population from an analysis of a
sample to record the results in such a manner that a definite range of
variation about the estimated measure is determined. The limits
of the definite range of variation about an estimated value are
established in such a way that we can state the probability that the
known value (mean, standard deviation, etc.) found from the sample
does not differ more than a determinate amount from the unknown
and generally unknowable true values of similar constants of the
parent population or universe. This determinate amount is generally
a standard deviation or a probable error. The computed value derived
from the sample is known, whereas the estimated value belonging
to the universe is generally not known. The computed value
is the basis of our estimate and our task is to measure the reliability
of the estimate. This measure of reliability is expressed in terms
of chance or probability. Our method is to find the determinate
amount and state the probability that the known value diverges
this amount from the true value.

¹ As a matter of fact 125 measurements constitute far too small a sample for
purposes of generalization. We use it here merely for illustrative purposes.

As an example, what can we say about the mean of the universe,
M_u, of college algebra marks from our analysis of the sample? We
must first compute the determinate amount, the probable error of
the mean, E_M, then interpret the result, where

\[ E_M = \frac{0.6745\,(\text{Standard deviation of sample})}{\sqrt{\text{Number in the sample}}} \]

or, in brief,

\[ E_M = 0.6745\,\frac{\sigma}{\sqrt{N}} \]

For the problem under discussion

\[ E_M = 0.6745\,\frac{9.16}{\sqrt{125}} = 0.55 \text{ c.u.} \]

and, as is customary, we write

\[ M_u = \text{Mean of universe} = 74.48 \pm 0.55 \text{ c.u.} \]

which, translated into English, reads “74.48 with a probable error of
0.55.” This means that the chances are even that the sample mean,
74.48 c.u., does not differ more than 0.55 c.u. from the true mean,
M_u. Note that we do not say, “The chances are even that the true
mean M_u does not differ more than 0.55 from the sample mean 74.48.”
M_u is a fixed value; it is not a variable as the quotation implies.
The sample means, however, are variable. This is an important
distinction.
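
The arithmetic for the probable error of the mean can be checked with a few lines of Python. This is a minimal sketch using the sample values quoted in the text (σ = 9.16 c.u., N = 125, M = 74.48 c.u.); the function name `probable_error_of_mean` is our own.

```python
import math

def probable_error_of_mean(sigma, n):
    """0.6745 times the sample standard deviation over the root of n."""
    return 0.6745 * sigma / math.sqrt(n)

e_m = probable_error_of_mean(9.16, 125)
lower, upper = 74.48 - e_m, 74.48 + e_m

print(round(e_m, 2))
print(round(lower, 2), round(upper, 2))
```

The two printed limits bound the interval that is written in the text as 74.48 ± 0.55 c.u.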

Doubtless, the two preceding paragraphs look formidable. Sup- 
pose we now try to make understandable what we have said. Our 
first sample of 125 scores chosen at random gave a mean 74.48 c.u. 
and a standard deviation of 9.16 c.u. Another sample of 125 scores 
chosen in a similar manner would probably yield slightly different 
results. In other words, these so-called statistical constants show 
variation as we move from sample to sample. 





We continue this sampling process until we have a large number
of sample means, sample standard deviations, etc. This large number
of sample means may be formed into a distribution of means.
The distribution of means has its mean, M_M; its standard deviation,
the standard deviation¹ of the means, σ_M; and its probable error,
the probable error of the means, E_M.

This distribution of means has some remarkable properties:

1. It is a normal distribution.

2. Its mean, M_M, is equal to the mean of the universe, i.e. M_M = M_u.

3. Its standard deviation is given by σ_M = σ/√N.

4. Its probable error is given by E_M = 0.6745 σ_M = 0.6745 σ/√N.

5. Two thirds of the sample means are included in the interval M_M ± σ_M
or in M_u ± σ_M.

6. One half of the sample means are included in the interval M_M ± E_M
or in M_u ± E_M. Thus, the probable error of the mean, E_M, is a value
such that the chances are even that a sample mean lies within the
interval, M_u ± E_M, or outside the interval.

7. Practically all the sample means are included in the interval
M_M ± 3σ_M or M_u ± 3σ_M.
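
The properties listed above can be illustrated by a small sampling experiment. The Python sketch below is not from the text: it draws 2,000 samples of 100 from a hypothetical normal universe (mean 70, standard deviation 10, both chosen arbitrarily), forms the distribution of sample means, and compares its spread with σ/√N and with the shares stated in properties 5, 6, and 7. The seed is fixed only so that the run is repeatable.

```python
import math
import random
import statistics

rng = random.Random(0)            # fixed seed, for a repeatable run
MU, SIGMA = 70.0, 10.0            # hypothetical universe, not from the text
N, SAMPLES = 100, 2000

# One mean per sample of N variates drawn from the universe.
means = [
    statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(SAMPLES)
]

sigma_m = SIGMA / math.sqrt(N)    # property 3
e_m = 0.6745 * sigma_m            # property 4

def share_within(width):
    """Fraction of sample means lying within MU ± width."""
    return sum(abs(m - MU) <= width for m in means) / SAMPLES

print(round(statistics.fmean(means), 2), round(statistics.pstdev(means), 2))
print(share_within(sigma_m), share_within(e_m), share_within(3 * sigma_m))
```

With samples of this size the observed shares should fall near two thirds, one half, and practically all, though any single run will show some sampling fluctuation.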


The student will observe from the third property that, even for
a reasonably large sample, the distribution of means is rather concentrated.
Thus for N = 100, σ_M = σ/10, and ± 3σ_M = ± 3σ/10,
which is a relatively small range. So if N is large, a sample mean M
is an excellent estimate of M_u. The little variation in the distribution
of means shows that the mean is a stable measure of central
tendency. Its stability is illustrated by the rather narrow normal
curve of Figure 15.

Similarly, the sample standard deviations may be formed into
a distribution of standard deviations. While this distribution is
not exactly normal for large values of N, if the samples are taken
from a normal universe it does not differ a great deal from normality.
It has its mean, M_σ; its standard² deviation, σ_σ; and its probable
error, E_σ.


¹ The standard deviation of the mean is frequently called the standard error
of the mean.

² The standard deviation of σ is frequently called the standard error of σ.





Figure 15 



The sample standard deviations are distributed almost symmetrically
about M_σ, which is approximately equal to the standard
deviation of the universe σ_u, and with a measurable variation.
We can measure the variation of the sample σ's by σ_σ or by E_σ.
Formulas for evaluating them are the following:

\[ \sigma_\sigma = \frac{\sigma}{\sqrt{2N}} = \frac{1}{\sqrt{2}} \cdot \frac{\sigma}{\sqrt{N}} = 0.707\,\sigma_M \]

\[ E_\sigma = 0.6745\,\sigma_\sigma = 0.6745\,\frac{\sigma}{\sqrt{2N}} = 0.707\,E_M = 0.4769\,\sigma_M \]

The probable error of the standard deviation, E_σ, is a value such
that the chances are even that a sample σ will lie within the interval
σ_u ± E_σ, or outside the interval. As an example illustrating its use,
what can we say about the standard deviation of the universe, σ_u,
of college algebra marks from our analysis of the sample? We find

\[ E_\sigma = 0.6745\,\frac{9.16}{\sqrt{250}} = 0.39 \text{ c.u.} \]

or from

\[ E_\sigma = 0.707\,E_M = 0.707(0.55) = 0.39 \text{ c.u.} \]

and, as is customary, we write

\[ \sigma_u = 9.16 \pm 0.39 \text{ c.u.} \]

which we read “9.16 with a probable error of 0.39.” This means
that the chances are even that the sample σ, 9.16 c.u., does not
differ more than 0.39 c.u. from the true standard deviation, σ_u.

If the student would prefer to use a “two-to-one-chance” language,
he may do so by using as the measures of variation σ_M and
σ_σ. This is quite a matter of taste and about tastes we do not
wish to argue.

As an illustrative example, again we consider the distribution of
college algebra marks. We have

\[ \sigma_M = \frac{\sigma}{\sqrt{N}} = \frac{9.16}{\sqrt{125}} = 0.82 \text{ c.u.} \]

\[ \sigma_\sigma = \frac{\sigma}{\sqrt{2N}} = 0.707\,\sigma_M = (0.707)(0.82) = 0.58 \text{ c.u.} \]

Thus, the odds are two to one that a sample mean will not differ
more than 0.82 c.u. from the mean of the universe, M_u. Or, about
two thirds of all sample means are included in the interval M_u ± σ_M.
Similarly, the odds are two to one that a sample standard deviation
will not differ more than 0.58 c.u. from the standard deviation of the
universe, σ_u. That is, about two thirds of the sample standard
deviations are included in the interval σ_u ± σ_σ.
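
The standard errors used in this illustration follow directly from the sample values. The Python sketch below is a minimal check, using σ = 9.16 c.u. and N = 125 from the text; the variable names are our own. Before rounding, the standard error of σ comes to about 0.579 c.u.

```python
import math

sigma, n = 9.16, 125                      # college algebra sample

sigma_m = sigma / math.sqrt(n)            # standard error of the mean
sigma_s = sigma / math.sqrt(2 * n)        # standard error of sigma
e_s = 0.6745 * sigma_s                    # probable error of sigma

print(round(sigma_m, 2), round(sigma_s, 3), round(e_s, 2))
# The standard error of sigma is the standard error of the mean over root 2.
print(round(sigma_s / sigma_m, 3))
```

The last line prints the ratio of the two standard errors, which is the factor 0.707 that appears in the formulas.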





EXERCISES 

1. Compute σ and M.D. from M for the distribution of Exercise 2,
page 54.

2. Compute σ for the distribution of Exercise 1, page 41. How many
measures are included by the interval M ± σ? Does M ± 3σ include the
entire group?

3. Find σ for the theoretical distribution of Exercise 2, page 42. How
many measures are included by the interval M ± 2σ?

4. Consider the distributions of heights and weights given in Exercise 1,
page 54. Which distribution has the greater dispersion?

5. Compute σ and V_σ for the distributions (a) and (b) of scores in
English found in Exercise 4, page 102.

6. Find σ and V_σ for the distributions of the measurements of eggs
found in Exercise 15, page 105. Compare these results with those obtained
in Exercise 5.

7. a. Show that the standard deviation of the first N integers is given
by the equation:

\[ \sigma = \sqrt{\frac{N^2 - 1}{12}} \]

b. Find σ for the first N odd integers.

8. If N_1, M_1, and σ_1 are the frequency, mean, and standard deviation
for one group of measures and N_2, M_2, and σ_2 for a second group, show that
the standard deviation of the group formed by combining the two groups
is given by:

\[ \sigma^2 = \frac{N_1\sigma_1^2 + N_2\sigma_2^2}{N} + \frac{N_1 N_2}{N^2}(M_1 - M_2)^2 \]

where

N = N_1 + N_2 = the total frequency of the combined groups, and
σ = the standard deviation of the combined groups

Hint: See Exercise 7 on page 74.

9. Apply the result of the preceding exercise to find the σ for the
distribution given in Exercise 8, page 103.

10. At a university 1,000 students were given an objective test. The
distribution of marks was closely normal. The analysis gave M = 72,
σ = 8. What were the approximate values of Q, Q_1, Q_3, M.D. from M,
M_o? Find E_x, E_M, E_σ, and interpret them.

11. Compute E_M and E_σ for the distributions of Exercise 1, page 54.
Interpret them.

12. Compute E_M and E_σ for the distribution of Exercise 2, page 54.
Interpret these values.




13. In a paper, “Experiment and Statistics in the Selection of Employees,”
in the Journal of American Statistical Association, March 1923,
p. 605, Mr. Harry A. Wembridge has presented data that show the points
scored on a mental test by 290 prospective employees and the per cent
of standard production attained by these same 290 persons after being
employed.

The results are:

Scores on Test            Per cent Production
N = 290                   N = 290
M_1 = 42.33               M_2 = 92.02
σ_1 = 9.25                σ_2 = 24.47

Compare the variability in mental ability with that of productive ability.

14. Find E_M for the data in Number 13 above and interpret the results.

15. The analysis of an approximately normal distribution of weekly
salaries of 300 men gave: M = $60.00 and σ = $10.

(1) About how many received salaries between $50 and $70?

(2) About how many received salaries between $40 and $80?

(3) Approximately, what were the largest and the smallest salaries
received?

16. The analysis of two approximately normal distributions of the
weekly salaries of 300 men each gave:

1st distribution          2nd distribution
M_1 = $35.00              M_2 = $60.00
M_d = $34.00              M_d = $58.00
σ_1 = $7.00               σ_2 = $10.00

Relatively, which distribution shows the greater dispersion?

17. Distributions of the heights and weights of 1,500 college men were
analyzed with the following results:

Heights                   Weights
N = 1500                  N = 1500
M_1 = 67.5 inches         M_2 = 135.4 pounds
σ_1 = 2.5 inches          σ_2 = 15.2 pounds

Which distribution shows the greater dispersion?

18. From the statistical summaries given in Number 17, assuming the
distributions were approximately normal, what are some conclusions that
may safely be drawn?

19. Prove: M_AX = A M_X. Illustrate.

20. Prove: M_(AX+B) = A M_X + B. Illustrate.

21. Prove: σ_AX = A σ_X. Illustrate.

22. Prove: σ_(AX+B) = A σ_X. Illustrate.





23. Prove: Σ(y − ax) = 0.

24. Prove: Σ(y − ax)² = N(a²σ_x² + σ_y²) − 2aΣxy.

25. Prove: ΣX · ΣY does not equal ΣXY.

26. Prove: Σ(Y − mX − b) = N(M_y − mM_x − b).

27. Supposing that the frequencies of the X-values are the terms of the
binomial expansion (q + p)^n as indicated in the table, find σ_x if (q + p)
= 1. Hint: complete the table as shown and recall that ΣXf(x) = np.
[See Exercise 42, p. 110.]

  X       f(x)                            X(X − 1)f(x)

  0       q^n
  1       n q^(n−1) p
  2       [n(n − 1)/2] q^(n−2) p²
  ...     ...
  n       p^n

28. On the X-axis are N fixed points X_1, X_2, . . . , X_N and an unknown
point X. Find X so that

\[ \sum_{i=1}^{N} (X_i - X)^2 \]

is a minimum. Compare with theorem on page 131.

29. In the result of Exercise 27 above, substitute n = 10, p = q = 1/2,
and find σ. Compare your result with that found in Exercise 3 of this set.



Chapter 5 

SKEWNESS: EXCESS: MOMENTS 
38. INTRODUCTION 

The two preceding chapters have been concerned with char- 
acterizing masses of numerical data by means of certain summarizing 
numbers. These summarizing numbers have in general been well- 
defined statistical constants that were designed to measure central 
tendency and dispersion. With the computation of these constants 
the distribution has been partially characterized and described. 

When we say of a distribution of heights, for example, that it shows 
a mean of 67.5 inches and a standard deviation of 2.5 inches, we know 
that approximately two-thirds of the total frequency is found within 
the interval 65 to 70; that it is extremely unlikely that any member 
of the distribution will be found without the limits 67.5 ± 3(2.5); 
that the total range is about 6(2.5) inches. If other summarizing 
numbers such as Q_1, Q_3, M_o, etc. are given, then our knowledge of the
distribution is considerably enlarged. The main purpose of these 
summarizing numbers is to assist us in comprehending the important 
features of a distribution though the distribution may not be present 
before us. 
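
The "two-thirds" and "three sigma" statements above can be checked numerically. The following is a minimal sketch that assumes the heights follow a normal curve (an assumption the paragraph leaves implicit); the helper name `fraction_within` is ours, not the text's.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


mean, sigma = 67.5, 2.5  # inches, the values quoted in the text

one_sigma = (mean - sigma, mean + sigma)            # the interval 65 to 70
three_sigma = (mean - 3 * sigma, mean + 3 * sigma)  # the limits 67.5 +/- 3(2.5)

print(one_sigma, round(fraction_within(1), 4))    # roughly two-thirds of the frequency
print(three_sigma, round(fraction_within(3), 4))  # nearly the whole frequency
```

Under the normal assumption the one-sigma interval holds a little more than two-thirds of the frequency, which is why the text says "approximately."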


39. THE MEANING OF SKEWNESS 

Our confidence in the conclusions mentioned in the preceding sec- 
tion is especially strengthened by the knowledge that the distributions 
of heights of men chosen at random are fairly symmetrical. However, 
we can conceive of a city police force constituted of men at least 
65 inches in height, that the symmetry of such a selected group 
would be greatly disturbed by the selectivity, and that the range 
of values greater than the mean would be longer than the range of 
values less than the mean. This characteristic feature of lack of 
symmetry in distributions is usually called skewness or asymmetry. 

In the preceding chapter emphasis was placed upon the fact that
dispersion is concerned with the amount of the variation rather than
with its direction. We feel the need for a statistical constant which
will summarize the direction of the variation or the departure from
symmetry. And just as we found it advisable to measure dispersion
for purposes of comparison by measures of relative variability, so for
purposes of comparison we must invent measures of relative skewness.
Owing to the fact that skewness is dependent upon the amount of
dispersion, the coefficients of relative skewness are obtained by
dividing the absolute skewness by some measure of absolute dispersion.
This method will result in ratios or abstract numbers which are
independent of the units in which the original variates are measured.


40. THE MEASUREMENT OF SKEWNESS 

It is an obvious fact that in unimodal symmetrical distributions 
the mean, the median, and the mode coincide. Also in symmetrical 
distributions the numerical distances from the median to the lower 
and upper quartiles are equal, and certain pairs of deciles are equi- 
distant from the median. As the distribution departs from sym- 
metry there is a separation of the three measures of central tendency, 
the difference between the mean and the mode being greatest. Also 
skewness is indicated when the distances from the median to the 
quartiles become unequal, and when pairs of deciles are not equi- 
distant from the median. Evidently any of these differences can 
be made the bases for measurements of skewness. 

Since the mean and the median are pulled away from the mode
in the direction of the skew, or the tail of the curve representing the
extreme measures, an evident measure of absolute skewness could
be taken to be $M - M_o$. Professor Karl Pearson has used this as
the basis for his formula for relative skewness, namely:

$$Sk = \frac{M - M_o}{\sigma} \qquad (1)$$

If the mean is to the right of the mode,¹ that is, if $M > M_o$, as in
curve A, the skewness is positive, whereas if the mean is to the left
of the mode, that is, if $M < M_o$, as in curve C, the skewness is negative.


1 See Figures 16 and 17. 




If the mean and the mode coincide, as in curves B and D, the
skewness is zero.

The formula (1), known as Pearson’s formula, is open to the
objection that in many distributions there is no well-defined mode.
Since in many distributions the approximate relation

$$M - M_o = 3(M - M_d)$$

has been found to obtain, this relation suggests the use of the
alternative Pearson form:

$$Sk = \frac{3(M - M_d)}{\sigma} \qquad (2)$$

Figure 16

Curve A shows positive skewness while curve B is symmetrical.


Since in measuring skewness we are interested in the degree of 
asymmetry a coefficient of skewness is always an index that may be 
used to compare the unsymmetrical distribution with a symmetrical 
one that we superimpose. Thus, in Figure 16 we may consider A 
as the frequency curve for a given distribution and B as the sym- 
metrical curve that is drawn to display the skewness in A. We 
indicate on the figure that the area bounded by the curves A and B 
and the X-axis causes the skewness in A. We can make a similar 
statement about the curves shown in Figure 17. 





Figure 17 



Curve C shows negative skewness while curve D is symmetrical. 


Let us be more specific and consider the four distributions of
Table 27. The student should verify the statistical constants
pertaining to each distribution. We have drawn histograms of these
distributions (see page 155) and on them we have located the points
which mark the position of the arithmetic mean and the median,
and the distance which indicates the value of the standard deviation.
The coefficients of skewness, since they are pure numbers or indexes,
cannot of course be shown on the graphs.

The reader should be warned that coefficients of skewness, like
all relative numbers, may not mean much until he has had
considerable experience with many and varied distributions. Only by
drawing the histograms, marking on them the points for M and $M_d$
and the distance for $\sigma$, then computing Sk by any of our formulas
and comparing the results for several distributions (the more the
better) will these values take on a real meaning.

It has been shown¹ that $(M - M_d)/\sigma$ lies between -1 and +1,
and thus skewness computed by formula (2) is always between -3
and +3. This measure of skewness is obviously quite sensitive.
While it is dangerous to set limits on such indexes, we may say, as
a rough measuring stick, that numerical values of skewness computed
by (2) less than 0.25 may be considered small, numerical values
between 0.25 and 0.5 as moderate, and numerical values greater
than 0.5 as large. Numerical values as large as 1 are unusual.

¹ Harold Hotelling and Leonard M. Solomons, Annals of Mathematical
Statistics, May 1932, pp. 141-142.

Table 27

  Class         X      A f(x)    B f(x)    C f(x)    D f(x)
  87.5-92.5     90        0         4         0         4
  82.5-87.5     85       12         4         4         8
  77.5-82.5     80       24        20        40        20
  72.5-77.5     75       28        44        24        24
  67.5-72.5     70       24        20        20        40
  62.5-67.5     65       12         4         8         4
  57.5-62.5     60        0         4         4         0

  N                     100       100       100       100
  M                      75        75        75        75
  $M_d$                  75        75        76.25     73.75
  $\sigma$                6         6         6         6
  Sk by (2)               0         0       - 0.625   + 0.625
  $\alpha_3$              0         0       - 1.2     + 1.2
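
The constants M, $M_d$, $\sigma$, and Sk by formula (2) in Table 27 can be verified with a short program. The sketch below is ours: the function names are hypothetical, and the median is found by the usual linear interpolation within the median class, which we assume is the method intended.

```python
import math

# Class mid-points (ascending) and frequencies of Table 27; class width 5.
MIDS = [60, 65, 70, 75, 80, 85, 90]
TABLE_27 = {
    "A": [0, 12, 24, 28, 24, 12, 0],
    "B": [4, 4, 20, 44, 20, 4, 4],
    "C": [4, 8, 20, 24, 40, 4, 0],
    "D": [0, 4, 40, 24, 20, 8, 4],
}
WIDTH = 5


def grouped_constants(mids, freqs, width):
    """Return (M, M_d, sigma) for a grouped frequency distribution."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(mids, freqs)) / n
    sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / n)
    # Median: interpolate linearly inside the class holding the (N/2)th item.
    below = 0
    median = None
    for x, f in zip(mids, freqs):
        if f > 0 and below + f >= n / 2:
            median = (x - width / 2) + (n / 2 - below) / f * width
            break
        below += f
    return mean, median, sigma


def pearson_sk(mean, median, sigma):
    """Formula (2): Sk = 3(M - M_d) / sigma."""
    return 3 * (mean - median) / sigma


def rough_size(sk):
    """The text's rough measuring stick for |Sk| computed by formula (2)."""
    size = abs(sk)
    if size < 0.25:
        return "small"
    if size <= 0.5:
        return "moderate"
    return "large"


results = {}
for name, freqs in TABLE_27.items():
    m, md, s = grouped_constants(MIDS, freqs, WIDTH)
    results[name] = pearson_sk(m, md, s)
    print(name, m, md, s, round(results[name], 3), rough_size(results[name]))
```

Distributions C and D differ only in the direction of their skew, so their coefficients come out equal in size and opposite in sign.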

Exercise. a. For each of the following distributions compute M,
$M_d$, $\sigma$, and Sk.

b. Draw the histogram and indicate upon it the points M, $M_d$,
and the distance $\sigma$.

  X        A f(x)    B f(x)    C f(x)    D f(x)
  35          4        16         2         1
  30         12        48         4         2
  25         20        12         8         3
  20         28        10        10         4
  15         20         8        12        15
  10         12         4        48        25
   5          4         2        16        50

  Total     100       100       100       100




[Histograms of the four distributions of Table 27; horizontal axis X,
from 57.5 to 92.5.]


A third measure of skewness that has become well known is that
due to Bowley. It is based upon the fact previously mentioned that
in an asymmetrical distribution the numerical distances from the
median to the lower and the upper quartiles are unequal. If $q_1$ and
$q_2$ are defined by the equations

$$q_1 = M_d - Q_1 \quad \text{and} \quad q_2 = Q_3 - M_d$$

then:

$$Sk = \frac{q_2 - q_1}{q_2 + q_1} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1} \qquad (3)$$


In regard to formula (3), Bowley says: 

If the curve is symmetrical, $q_2 = q_1$, and Sk = 0; if $q_2 > q_1$, Sk is
positive, and if $q_2 < q_1$, Sk is negative. Sk becomes +1 if $q_1 = 0$, that
is, if the median and lower quartile coincide; and Sk becomes -1 if $q_2 = 0$.
Sk is therefore a measurement which never exceeds 1 numerically, and has
a definite significance at zero and at its extreme values. . . . The
significance of the various values can only be obtained by experience, but
it may be suggested that 0.1 is a moderate degree of skewness, and 0.3 a
considerable degree.¹

The quartile measure of skewness is rigidly defined, is simple to 
compute, and is easily understood. It is a pure number, and the re- 
striction of its value to the small interval from — 1 to + 1 leaves it 
sufficiently sensitive for many needs. A just criticism is that it 
fails to take into consideration the size of the extreme variations. 
Since the main question in skewness is the determination of how 
much more the items deviate on one side of the mean than on the 
other, the ideal measure of skewness should give due emphasis to 
the extreme variations. 
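
Formula (3) and Bowley's remarks about its limiting values can be tried on a few sets of quartiles. This is a minimal sketch; the function name `bowley_sk` and the quartile values are hypothetical and chosen only for illustration.

```python
def bowley_sk(q1, median, q3):
    """Formula (3): Sk = (q_2 - q_1) / (q_2 + q_1), from the three quartile points."""
    lower = median - q1   # q_1 in the text: median to lower quartile
    upper = q3 - median   # q_2 in the text: median to upper quartile
    return (upper - lower) / (upper + lower)


# Hypothetical quartile values, chosen only to exercise the formula.
symmetrical = bowley_sk(60, 70, 80)   # q_1 = q_2, so Sk = 0
moderate = bowley_sk(60, 68, 80)      # q_2 > q_1, so Sk is positive
extreme_pos = bowley_sk(70, 70, 80)   # median coincides with Q_1, so Sk = +1
extreme_neg = bowley_sk(60, 80, 80)   # median coincides with Q_3, so Sk = -1

print(symmetrical, moderate, extreme_pos, extreme_neg)
```

Because only the three quartile points enter the computation, moving any item that lies beyond the quartiles leaves Sk unchanged, which is the criticism made above.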

Many of the objections to the previously mentioned methods for
measuring skewness may be met by returning to a consideration of
the deviations of the variates from their mean. Since we are
interested in how the variates are situated with respect to the mean
and since we wish to give emphasis to the extreme measures, we
require some function of the form

$$\Sigma x^n f(x)$$

for some value of n. Now if n is even, we obtain the amount and
not the direction of the variation. In order to secure the direction of
the variation, we are compelled to use odd numbers for n. If n = 1,
$\Sigma x f(x) = 0$. If n = 3, we obtain $\Sigma x^3 f(x)$, a basic factor in our next
measure for skewness known as $\alpha_3$ (read: alpha three). $\alpha_3$ is defined
as the third moment of the distribution about the mean divided by
the cube of the standard deviation, or by the equation:

$$\alpha_3 = \frac{\Sigma x^3 f(x)/N}{\sigma^3} = \frac{\text{the third moment about } M}{\text{cube of the standard deviation}}$$

that is

$$\alpha_3 = \frac{\nu_3}{\sigma^3} = \frac{\text{nu three}}{\text{sigma cube}} \qquad (4)$$

¹ Bowley, op. cit., p. 116.


In what follows we shall consider $\alpha_3$ as the preferable measure of
skewness.¹

As an illustrative problem we shall compute $\alpha_3$ for the distribution
of grades in college algebra. The table will be a continuation of
Table 23, page 126.


Table 28. Computing $\alpha_3$ for the Distribution of Grades
in College Algebra by the Definition (M = 74.48)

  X       f(x)       x         $x^2 f(x)$        $x^3 f(x)$
  95        4       20.52      1,684.2816      34,561.458432
  90        6       15.52      1,445.2224      22,429.851648
  85       12       10.52      1,328.0448      13,971.031296
  80       19        5.52        578.9376       3,195.735552
  75       37        0.52         10.0048           5.202496
  70       24      - 4.48        481.6896     - 2,157.969408
  65       11      - 9.48        988.5744     - 9,371.685312
  60        6     - 14.48      1,258.0224    - 18,216.164352
  55        4     - 19.48      1,517.8816    - 29,568.333568
  50        2     - 24.48      1,198.5408    - 29,340.278784

  Total   125                 10,491.2000    - 14,491.152000


$$\sigma^2 = \nu_2 = \frac{\Sigma x^2 f(x)}{N} = \frac{10491.2}{125} = 83.9296 \text{ (c.u.)}^2$$

$$\sigma = 9.16 \text{ c.u.}$$

$$\sigma^3 = 768.795136 \text{ (c.u.)}^3$$

$$\nu_3 = \frac{\Sigma x^3 f(x)}{N} = \frac{-14491.152}{125} = -115.929216 \text{ (c.u.)}^3$$

$$\alpha_3 = \frac{\nu_3}{\sigma^3} = \frac{-115.929216}{768.795136} = -0.1507$$

¹ $\alpha_3$ is zero for the normal curve. See page 405.

$\alpha_3$ is a very refined measure of skewness. The process of cubing
maintains the proper signs for the deviations and also gives emphasis
to the extreme variates. Further, the division by $\sigma^3$ reduces the
measure to an abstract number. Hence it is a coefficient of relative
skewness and is independent of the unit of measure. Since it is not
restricted in its range, it is a very sensitive measure, the sensitiveness
being emphasized by the cubing of the deviations. Its chief
disadvantage is the apparent labor of computing it. We shall greatly
overcome this apparent trouble in Section 44 (p. 164) by developing
a “short method.”
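
The computation of Table 28 can be checked by applying the definition directly. The sketch below is ours (the variable names are not the text's) and uses only the X and f(x) columns of the table.

```python
import math

# X and f(x) columns of Table 28 (grades in college algebra).
X = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
f = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]

N = sum(f)
M = sum(x * fx for x, fx in zip(X, f)) / N

# Second and third moments about the mean, by the definition.
nu2 = sum(fx * (x - M) ** 2 for x, fx in zip(X, f)) / N
nu3 = sum(fx * (x - M) ** 3 for x, fx in zip(X, f)) / N

sigma = math.sqrt(nu2)
alpha3 = nu3 / sigma ** 3

print(N, M, round(nu2, 4), round(nu3, 6), round(alpha3, 4))
```

The program carries $\sigma$ at full precision rather than rounding it to 9.16 before cubing, so its $\alpha_3$ agrees with the tabulated value only to about three decimal places.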


41. EXCESS OR KURTOSIS 

In elementary statistics a distribution is usually satisfactorily
characterized by the measures of central tendency, the measures of
dispersion, and the measures of skewness, or more briefly, by M,
$\sigma$, and $\alpha_3$. We may add one other important constant to the
summarized description by considering the relative number of the
variates in the immediate neighborhood of the mean or mode. This
measure of relative flatness (or peakedness) of a curve fitted to the
distribution as compared with that of the normal curve fitted to the
same distribution is called a measure of excess or kurtosis.

The excess or kurtosis is measured by:

$$K = \alpha_4 - 3 = \frac{\Sigma x^4 f(x)/N}{\sigma^4} - 3 = \frac{\nu_4}{\sigma^4} - 3 \qquad (5)$$

Again the normal curve is our standard for comparison. Since
$\alpha_4 = 3$ for the normal curve (see page 405) the excess for any other
curve is merely a comparison of its $\alpha_4$ with that of the normal curve
which has the same standard deviation.

If the excess is positive (leptokurtic), the number of variates near
the mean is greater than in a normal distribution. If the excess is
negative (platykurtic), the curve is more flat-topped than the
corresponding normal frequency curve. The normal curve, in which
$\alpha_4 = 3$, is said to be mesokurtic.

Figure 19, exhibiting three curves with the same mean and the 
same standard deviation, illustrates graphically the meaning of excess. 


Figure 19 



Curve A is platykurtic and $\alpha_4 - 3 < 0$; curve B is mesokurtic
(normal), and $\alpha_4 - 3 = 0$; and curve C is leptokurtic and $\alpha_4 - 3 > 0$.
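
Formula (5) can be illustrated with distributions A and B of Table 27, which share the same mean (75) and the same standard deviation (6) and so differ chiefly in peakedness. The following is a sketch under that choice of data; the function names are ours.

```python
# Class mid-points and frequencies of distributions A and B of Table 27.
MIDS = [60, 65, 70, 75, 80, 85, 90]
FREQ_A = [0, 12, 24, 28, 24, 12, 0]
FREQ_B = [4, 4, 20, 44, 20, 4, 4]


def central_moment(mids, freqs, order):
    """Moment of the given order about the mean: sum of x^n f(x), divided by N."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(mids, freqs)) / n
    return sum(f * (x - mean) ** order for x, f in zip(mids, freqs)) / n


def excess(mids, freqs):
    """Formula (5): K = nu_4 / sigma^4 - 3, with sigma^4 = nu_2 squared."""
    nu2 = central_moment(mids, freqs, 2)
    nu4 = central_moment(mids, freqs, 4)
    return nu4 / nu2 ** 2 - 3


k_a = excess(MIDS, FREQ_A)
k_b = excess(MIDS, FREQ_B)
print(round(k_a, 4), round(k_b, 4))  # A is negative (platykurtic), B positive (leptokurtic)
```

Distribution B concentrates more of its frequency at the central class and also reaches farther into the tails, and both features raise the fourth moment relative to $\sigma^4$.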

42. THE UNADJUSTED MOMENTS OF A DISTRIBUTION 

In the preceding chapters we have several times mentioned the 
term moment. It is a concept so important in statistical analysis 
that we cannot longer defer its more complete consideration. We 
shall soon learn that a statistical distribution — which we have 
characterized by its mean, its standard deviation, its skewness, its 
excess — is, in brief, characterized by its moments. 

Further, the notion of moments serves as a guide in curve-fitting.
It was remarked in Section 16 (p. 38) that the total area under a
frequency curve should equal the area of the histogram, which is
another way of saying that the total frequency should be unchanged.
As the total frequency is the zeroth moment, this is equivalent to
requiring that the zeroth moment of the frequency curve equal the
zeroth moment of the given distribution. In like manner, we may
require that a sufficient number of successive moments of higher
orders of the frequency curve be equal to the corresponding higher
moments of the given distribution. This is the so-called principle
of moments¹ for the determination of the parameters in a curve
which is to be selected to represent a given distribution. We shall
have an opportunity to observe an application of the principle of
moments in Chapters 7, 12, and 13.

The moments of a distribution can be computed about any point
at pleasure. They can be expressed in various units. The most
significant moments are referred to the mean, M, and are usually
expressed either in the given or the class unit. They may then be
reduced to abstract numbers by dividing by the appropriate powers
of $\sigma$ as was done, for example, in defining $\alpha_3$ and $\alpha_4$. As
illustrations, we have learned that if x equals the deviation of any
frequency from M expressed in the given unit, then:

$\nu_1 = \dfrac{\Sigma x f(x)}{N}$ = the 1st moment of the distribution about
M expressed in the given unit = 0

$\nu_2 = \dfrac{\Sigma x^2 f(x)}{N}$ = the 2nd moment of the distribution about
M expressed in the (given unit)$^2$ = $\sigma^2$

$\nu_3 = \dfrac{\Sigma x^3 f(x)}{N}$ = the 3rd moment of the distribution about
M expressed in the (given unit)$^3$

$\nu_4 = \dfrac{\Sigma x^4 f(x)}{N}$ = the 4th moment of the distribution about
M expressed in the (given unit)$^4$

etc.

Hence in general we define:

$\nu_n = \dfrac{\Sigma x^n f(x)}{N}$ = the nth moment of the distribution about
M expressed in the (given unit)$^n$     (6)

If n = 0, we have:

$$\nu_0 = \frac{\Sigma x^0 f(x)}{N} = 1$$


1 Rietz and others, op. cit., p. 68. 





In our previous discussion, not only have we encountered moments
about M but about other points as well. Formula (7), page 9,
involves, for example:

$$\frac{\Sigma X f(x)}{N} = \text{the 1st moment about zero} = M$$

and

$$\frac{\Sigma X^2 f(x)}{N} = \text{the 2nd moment about zero.}$$

The higher moments about zero are similarly defined.

In computing the standard deviation we noted that the arithmetical 
computations were simplified by referring the variates to some point 
near the mean and expressing them in class units (see Table 25, 
p. 130). We shall soon discover that this transformation to class 
units is especially useful when computing the higher moments. 
In computing $\alpha_3$ (Table 28, p. 157) we felt a need for some short
method.

On pages 72 and 130 we have noted that if x' equals the deviation
of any frequency from the assumed origin $O'(h, 0)$ expressed in the
class unit, then:

$\nu'_1 = \dfrac{\Sigma x' f(x)}{N}$ = the 1st moment of the distribution about
O' expressed in the class unit = $b_x$

$\nu'_2 = \dfrac{\Sigma x'^2 f(x)}{N}$ = the 2nd moment of the distribution about
O' expressed in the (class unit)$^2$

Hence in general we may define:

$\nu'_n = \dfrac{\Sigma x'^n f(x)}{N}$ = the nth moment of the distribution about
O' expressed in the (class unit)$^n$

If w = the class width, x' class units = wx' given units, and
hence:

$\nu'_n = \dfrac{\Sigma (wx')^n f(x)}{N} = \dfrac{w^n\,\Sigma x'^n f(x)}{N}$ = the nth moment about O' expressed
in the (given unit)$^n$

Therefore we have the theorem: The nth moment of a given
distribution about any point O' in the nth power of the given unit equals
$w^n$ times its nth moment about O' in the nth power of the class unit. Or
in short:

$\nu'_n$ in the (given unit)$^n$ = $w^n \nu'_n$ in the (class unit)$^n$     (8)





While the most significant moments are those computed about the
mean, yet their computation directly from the definition is very
tedious, owing to the fact that M usually involves several decimals,
and hence $x = X - M$ also involves several decimals. Raising these
decimals to the third, fourth, and higher powers is laborious, even
with the aid of a calculating machine. However, just as we were able
to avoid this tedium with the short method for computing $\sigma$
(Table 25, p. 130), so we shall avoid it in computing the higher moments.

From Figure 1 (p. 73) we have noted that:

$$x = wx' - wb_x = w(x' - b_x) \quad \text{expressed in the given unit}$$

Hence:

$$\nu_1 = \frac{\Sigma x f(x)}{N} = \frac{\Sigma w(x' - b_x) f(x)}{N} = \frac{w\,\Sigma x' f(x)}{N} - \frac{w b_x\,\Sigma f(x)}{N} = w b_x - w b_x = 0$$

$$\nu_2 = \frac{\Sigma x^2 f(x)}{N} = \frac{\Sigma w^2 (x' - b_x)^2 f(x)}{N} = \frac{\Sigma w^2 [x'^2 - 2b_x x' + b_x^2] f(x)}{N} = w^2(\nu'_2 - b_x^2)$$

In like manner, it follows that:

$$\nu_3 = w^3(\nu'_3 - 3\nu'_2 b_x + 2b_x^3)$$

$$\nu_4 = w^4(\nu'_4 - 4\nu'_3 b_x + 6\nu'_2 b_x^2 - 3b_x^4) \qquad (9)$$

etc.

These moments described in (9) of course express the $\nu$'s in given
units. If the class interval is taken as the unit, which is usually the
case, w = 1, and then the moments are expressed in class units.
If h = 0, $\nu'_n$ becomes the nth moment about zero as origin.
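
Equations (9) can be checked against the direct definition. The sketch below uses the grades of Table 28 with an assumed origin h = 75 and class width w = 5 (our choice of origin, made only for illustration) and compares the transferred moments with those computed about the mean directly.

```python
# X and f(x) columns of Table 28; h and w are our illustrative choices.
X = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
f = [4, 6, 12, 19, 37, 24, 11, 6, 4, 2]
h, w = 75, 5
N = sum(f)

# Moments about the assumed origin O', in class units.
xp = [(x - h) // w for x in X]                      # x' = 4, 3, ..., -5
raw = [sum(fx * d ** k for d, fx in zip(xp, f)) / N for k in (1, 2, 3, 4)]
b = raw[0]                                          # b_x, the first moment about O'

# Equations (9): transfer to the mean and convert to given units.
via9 = (
    w ** 2 * (raw[1] - b ** 2),
    w ** 3 * (raw[2] - 3 * raw[1] * b + 2 * b ** 3),
    w ** 4 * (raw[3] - 4 * raw[2] * b + 6 * raw[1] * b ** 2 - 3 * b ** 4),
)

# The same moments computed directly about M = h + w * b_x.
M = h + w * b
direct = tuple(
    sum(fx * (x - M) ** k for x, fx in zip(X, f)) / N for k in (2, 3, 4)
)

print(M, via9, direct)
```

Because x' takes only small whole-number values, the sums in the short method involve no decimals until the final division, which is the saving of labor the text describes.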

We have found it desirable to express M and $\sigma$ in terms of the
given unit. However the third, fourth, and higher moments are
usually expressed as ratios in such a manner that they are independent
of the unit of measure. This was accomplished in defining $\alpha_3$ and
$\alpha_4$ by dividing $\nu_3$ and $\nu_4$ by $\sigma^3$ and $\sigma^4$ respectively.¹ Thus:

¹ $\alpha_n$ is the nth moment about M expressed in (standard units)$^n$. A
variate is expressed in standard units by dividing its deviation from M by
$\sigma$. It is usually indicated by t. Thus, $t_i = \dfrac{x_i}{\sigma} = \dfrac{X_i - M}{\sigma}$.





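$$\alpha_3 = \frac{\nu_3}{\sigma^3} = \frac{\Sigma t^3 f(x)}{N}$$

$$\alpha_4 = \frac{\nu_4}{\sigma^4} = \frac{\Sigma t^4 f(x)}{N}$$

and in general

$$\alpha_n = \frac{\nu_n}{\sigma^n} = \frac{\Sigma t^n f(x)}{N}$$

In particular we note that:

$$\alpha_1 = 0 \quad \text{and} \quad \alpha_2 = 1$$

The moments

$$\nu_n = \frac{\Sigma x^n f(x)}{N} \quad \text{and} \quad \nu'_n = \frac{\Sigma x'^n f(x)}{N}$$

are frequently called the crude or unadjusted moments about the mean
and assumed point respectively. The standard deviation, the
skewness, and the excess, if based upon them, are called the unadjusted
standard deviation, the unadjusted skewness, etc.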


43. THE ADJUSTED MOMENTS: SHEPPARD'S
CORRECTIONS

In arranging our data in a frequency distribution we have assumed
that the items in a given class were concentrated at its mid-point.
This procedure introduces a slight error, which we call a grouping
error. By a process too abstruse for consideration here, certain
corrections, known as Sheppard’s Corrections, have been devised
to assist in correcting the errors in the moments due to grouping.
When applied to the crude moments they give the adjusted moments
of the distribution. It is quite customary to denote the adjusted
moments by $\mu_n$ (read: mu enn), n = 1, 2, 3, etc. They find their
widest application in fitting a frequency function to a distribution
of observed measurements by what is known as the method of
moments.

The adjusted moments are not generally recommended for use in
unrefined statistical analysis. Especially is this true if the original
data were not taken with sufficient accuracy to warrant our using the
niceties of analysis that are implied in the corrections, for certainly
we should not adopt methods in computation that are inconsistent
with the data at hand. A more potent reason for our failure to
recommend their employment generally is that an intelligent use of
them requires a knowledge of their development.¹
Finished statisticians use them with care and discrimination. We
do not wish to discourage their use when the data warrant it and
when they can be employed with safety and confidence, but we do
insist that they should be used with understanding. We mention
them here to add completeness to our text, to illustrate the method
of computing them to the student who may continue the study of
statistical analysis beyond this introductory text, and to caution
the reader against their indiscriminate use.

The adjusted moments involving Sheppard’s Corrections are
given by the following equations (Sheppard’s Corrections if moments
are expressed in the given unit):

$$\mu_1 = \nu_1$$

$$\mu_2 = \nu_2 - \frac{w^2}{12}$$

$$\mu_3 = \nu_3$$

$$\mu_4 = \nu_4 - \frac{w^2}{2}\,\nu_2 + \frac{7w^4}{240} \qquad (10)$$

where w is the class interval.

If the moments are expressed in the class unit, w = 1, and the
simplifications are evident.

The refined or adjusted formulas for the standard deviation, the
skewness, and the excess are given by:

$$\sigma = \sqrt{\mu_2}, \qquad \alpha_3 = \frac{\mu_3}{\sigma^3}, \qquad \alpha_4 - 3 = \frac{\mu_4}{\sigma^4} - 3$$

No corrections are applied to the moments of theoretical distributions
and curves. In such cases we indicate the nth moment about M by $\mu_n$ and
about any other point by $\mu'_n$.


44. COMPUTATION OF THE MOMENTS 
The order of procedure when computing the moments should be:

1. Choose a convenient arbitrary origin, and compute $\nu'_1$, $\nu'_2$, $\nu'_3$, $\nu'_4$.

2. Transfer the moments to the mean by means of equations (9), and
thus compute $\nu_1$, $\nu_2$, $\nu_3$, $\nu_4$. See that the proper units are included.

3. If Sheppard's Corrections are to be applied, use equations (10) and
compute $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$.

We shall illustrate this procedure by computing the moments of 
the following distribution: 

1 Rietz and others, op. cit., pp. 92 et seq.





Table 29. Frequency Distribution of Pulse Beats per Minute
in English Convicts¹

  X        f(x)     x'     $x' f(x)$    $x'^2 f(x)$    $x'^3 f(x)$    $x'^4 f(x)$
   46.5      2     - 8       - 16          128        - 1,024          8,192
   50.5      5     - 7       - 35          245        - 1,715         12,005
   54.5     17     - 6      - 102          612        - 3,672         22,032
   58.5     57     - 5      - 285        1,425        - 7,125         35,625
   62.5     90     - 4      - 360        1,440        - 5,760         23,040
   66.5    150     - 3      - 450        1,350        - 4,050         12,150
   70.5    120     - 2      - 240          480          - 960          1,920
   74.5    131     - 1      - 131          131          - 131            131
   78.5    109       0          0            0              0              0
   82.5     86       1         86           86             86             86
   86.5     62       2        124          248            496            992
   90.5     42       3        126          378          1,134          3,402
   94.5     15       4         60          240            960          3,840
   98.5     18       5         90          450          2,250         11,250
  102.5      9       6         54          324          1,944         11,664
  106.5      5       7         35          245          1,715         12,005
  110.5      3       8         24          192          1,536         12,288
  114.5      3       9         27          243          2,187         19,683

  Total    924               - 993        8,217       - 12,129        190,305


Choosing h = 78.5, we have for the $\nu'$'s:

$$\nu'_1 = \frac{\Sigma x' f(x)}{N} = b_x = \frac{-993}{924} = -1.0746753$$

$$\nu'_2 = \frac{\Sigma x'^2 f(x)}{N} = \frac{8217}{924} = 8.8928571$$

$$\nu'_3 = \frac{\Sigma x'^3 f(x)}{N} = \frac{-12129}{924} = -13.12662338$$

$$\nu'_4 = \frac{\Sigma x'^4 f(x)}{N} = \frac{190305}{924} = 205.9577923$$

Using equations (9) we shall now express the $\nu$’s in the given unit.
We have, noting that w = 4:

$$\nu_1 = 0$$

$$\nu_2 = 16[8.8928571 - (-1.0746753)^2] = 16(8.8928571 - 1.1549270) = 16(7.73793014)$$

1 The data are taken from Biometrika , Vol. 11. 





$$\nu_3 = 64[-13.12662338 - 3(8.8928571)(-1.0746753) + 2(-1.0746753)^3]$$
$$= 64(-13.12662338 + 28.67080162 - 2.48234304)$$
$$= 64(13.06183520)$$

$$\nu_4 = 256[205.9577923 - 4(-13.12662338)(-1.0746753) + 6(8.8928571)(-1.0746753)^2 - 3(-1.0746753)^4]$$
$$= 256(205.9577923 - 56.42753004 + 61.62360463 - 4.0015692)$$
$$= 256(207.1522977)$$

Assuming that Sheppard’s Corrections may be applied, we find the $\mu$’s:

$$\mu_1 = \nu_1 = 0$$

$$\mu_2 = 16(7.73793014) - 16(0.08333333) = 16(7.65459681)$$

$$\mu_3 = \nu_3 = 64(13.06183520)$$

$$\mu_4 = 256[207.152298 - 0.50(7.73793014) + 0.029167] = 256(203.312500)$$


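Hence we have the unadjusted constants:

$$M = 78.5 + 4(-1.074675) = 74.2 \text{ p.b.}$$

$$\sigma = \sqrt{\nu_2} = 4(2.7817135) = 11.12685 \text{ p.b.}$$

$$\alpha_3 = \frac{\nu_3}{\sigma^3} = \frac{64(13.06183520)}{64(21.5247046)} = 0.6068$$

$$\alpha_4 = \frac{\nu_4}{\sigma^4} = \frac{256(207.1522977)}{256(59.875562)} = 3.4597$$

$$K = \alpha_4 - 3 = 0.4597$$

and the adjusted constants:

$$M = 74.2 \text{ p.b.}$$

$$\sigma = \sqrt{\mu_2} = 4(2.7667) = 11.0668 \text{ p.b.}$$

$$\alpha_3 = \frac{\mu_3}{\sigma^3} = 0.6168$$

$$\alpha_4 = \frac{\mu_4}{\sigma^4} = \frac{256(203.312500)}{256(58.592852)} = 3.46992$$

$$K = \alpha_4 - 3 = 0.46992$$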





The student will note that the application of Sheppard’s Corrections 
here has affected the constants slightly. 
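
The three steps of the procedure can be carried through by machine for the pulse-beat distribution of Table 29. The following is a sketch rather than a prescribed routine: the variable and function names are ours, and the origin h = 78.5 and class width w = 4 are those used in the text.

```python
import math

# f(x) column of Table 29; X runs from 46.5 to 114.5 in steps of 4.
f = [2, 5, 17, 57, 90, 150, 120, 131, 109, 86, 62, 42, 15, 18, 9, 5, 3, 3]
h, w = 78.5, 4.0
xp = list(range(-8, 10))  # x', the deviations from h in class units
N = sum(f)

# Step 1: moments about the arbitrary origin, in class units.
raw = [sum(fx * d ** k for d, fx in zip(xp, f)) / N for k in (1, 2, 3, 4)]
b = raw[0]

# Step 2: equations (9), transferring to the mean in given units.
nu2 = w ** 2 * (raw[1] - b ** 2)
nu3 = w ** 3 * (raw[2] - 3 * raw[1] * b + 2 * b ** 3)
nu4 = w ** 4 * (raw[3] - 4 * raw[2] * b + 6 * raw[1] * b ** 2 - 3 * b ** 4)

# Step 3: equations (10), Sheppard's Corrections.
mu2 = nu2 - w ** 2 / 12
mu3 = nu3
mu4 = nu4 - (w ** 2 / 2) * nu2 + 7 * w ** 4 / 240


def constants(m2, m3, m4):
    """Return (sigma, alpha_3, alpha_4) from the 2nd, 3rd, and 4th moments."""
    s = math.sqrt(m2)
    return s, m3 / s ** 3, m4 / m2 ** 2


M = h + w * b
unadjusted = constants(nu2, nu3, nu4)
adjusted = constants(mu2, mu3, mu4)
print(round(M, 1), [round(c, 4) for c in unadjusted], [round(c, 4) for c in adjusted])
```

Only the second and fourth moments are altered by the corrections, so the adjusted $\alpha_3$ differs from the unadjusted one solely through the smaller $\sigma$ in its denominator.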

Assuming that the parent population is normal, the values of $\alpha_{3,u}$
and $\alpha_{4,u}$ of the universe are usually written:

$$\alpha_{3,u} = (\text{the computed } \alpha_3) \pm 0.6745\sqrt{\frac{6}{N}}$$

$$\alpha_{4,u} = (\text{the computed } \alpha_4) \pm 0.6745\sqrt{\frac{24}{N}}$$

These statements mean that the chances are even, or it is equally
likely, that the computed values of $\alpha_3$ and $\alpha_4$ do not differ numerically
more than the specified amounts from the true values, $\alpha_{3,u}$ and $\alpha_{4,u}$.
For the illustrative problem we are considering we have:

$$M = 74.2 \pm 0.2456$$
$$\sigma = 11.0668 \pm 0.1736$$
$$\alpha_3 = 0.6168 \pm 0.0544$$
$$\alpha_4 = 3.46992 \pm 0.1088$$


(See Section 37, p. 142.) 
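
The four probable errors quoted for the illustrative problem can be reproduced as follows. This sketch assumes the usual normal-theory forms: $0.6745\,\sigma/\sqrt{N}$ for the mean, $0.6745\,\sigma/\sqrt{2N}$ for the standard deviation, and $0.6745\sqrt{6/N}$ and $0.6745\sqrt{24/N}$ for $\alpha_3$ and $\alpha_4$. These forms are our assumption; they agree with the printed figures to within rounding in the fourth decimal place.

```python
import math

N = 924            # total frequency of Table 29
sigma = 11.0668    # adjusted standard deviation, in pulse beats
FACTOR = 0.6745    # converts a standard error into a probable error

pe_mean = FACTOR * sigma / math.sqrt(N)
pe_sigma = FACTOR * sigma / math.sqrt(2 * N)
pe_alpha3 = FACTOR * math.sqrt(6 / N)
pe_alpha4 = FACTOR * math.sqrt(24 / N)

for name, value in [("M", pe_mean), ("sigma", pe_sigma),
                    ("alpha_3", pe_alpha3), ("alpha_4", pe_alpha4)]:
    print(name, round(value, 4))
```

The probable errors of $\alpha_3$ and $\alpha_4$ depend on N alone, so for a given sample size they are the same whatever the distribution under study.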


EXERCISES 

1. Using Bowley’s coefficient, formula (3), find the skewness for the dis- 
tribution of grades in college algebra as given in Table 8 (p. 26). 

2. Using Pearson’s formula (2), find the skewness for the distributions 
of heights and weights described in Exercise 1, page 54. 

3. Using Bowley’s coefficient, find the skewness for the distributions 
of heights and weights described in Exercise 1, page 54. 

4. Find $\sigma$, $\alpha_3$, and $\alpha_4$ for the distributions of Exercise 4, page 102.

5. Continue the analysis of Exercise 6 on page 147 by finding $\alpha_3$ and $\alpha_4$
for the distributions described in Exercise 15, page 105.

6. If the class interval is taken as a unit, i.e., if w = 1, show that:

$$\nu_2 = \nu'_2 - b_x^2,$$

$$\nu_3 = \nu'_3 - 3\nu'_2 b_x + 2b_x^3,$$

$$\nu_4 = \nu'_4 - 4\nu'_3 b_x + 6\nu'_2 b_x^2 - 3b_x^4$$

7. Compute M, $\sigma$, $\alpha_3$, and $\alpha_4$ for the distribution in Exercise 2, page 54.

8. Compute $\alpha_3$ and $\alpha_4$ for the data of Exercise 19 at the end of this
chapter.

9. Compute $\alpha_3$ and $\alpha_4$ for the data of Exercise 24 at the end of this
chapter.





10. Compute M, $M_d$, $M_o$, $\sigma$, $\alpha_3$, and $\alpha_4$ for this table of chest
measurements.

The Chest Measurements of 10,000 Men
(Original measurements to the nearest inch)¹

  X        f(x)
  33          6
  34         35
  35        125
  36        338
  37        740
  38      1,303
  39      1,810
  40      1,940
  41      1,640
  42      1,120
  43        600
  44        222
  45         84
  46         30
  47          5
  48          2

  Total  10,000

11. Compute M, $M_d$, $M_o$, $\sigma$, $\alpha_3$, and $\alpha_4$ for this table of heights.

Distribution of Heights of 6,441 Colored Soldiers
(Original measurements to the nearest centimeter)²

  X         f(x)
  148.5        2
  150.5        9
  152.5       13
  154.5       23
  156.5       56
  158.5       88
  160.5      162
  162.5      318
  164.5      468
  166.5      564
  168.5      665
  170.5      708
  172.5      749
  174.5      747
  176.5      586
  178.5      469
  180.5      314
  182.5      207
  184.5      133
  186.5       70
  188.5       38
  190.5       22
  192.5       15
  194.5       10
  196.5        3
  198.5        2

  Total    6,441


1 The data are taken from E. T. Whittaker and George Robinson, The Cal- 
culus of Observations , 1924, p. 189. 

2 The data are taken from Annual Report of the Surgeon General, Medical 
Department of the United States Army, Vol. XV, Pt. I, p. 522. 





45. RETROSPECT AND PROSPECT 

We have now come to the end of our first important statistical 
problem, the elementary analysis of a simple frequency distribution. 
This analysis has been accomplished by computing certain statistical 
constants and making simple and concise statements about them. If 
a distribution is fairly symmetrical, the arithmetic mean and the 
standard deviation are usually sufficient to give a numerical de- 
scription. If it is skew, then a coefficient of skewness is included. 
If further refinement is desired, a coefficient of kurtosis is computed. 
Each computed parameter adds to our information about the dis- 
tribution in question. 

To proceed further into the analysis of a frequency distribution 
would take us into the study of frequency curves which, as we have 
previously stated, is beyond the scope of this text. While in a later 
chapter, we do consider the normal frequency curve (see Chapter 12), 
the study of skew frequency curves would take us too far afield. 
This is a topic to which the student trained in the calculus and ele- 
mentary statistical analysis may look forward. 

The second problem that we shall consider is the important 
problem of correlation. However, before we approach it, we shall 
deviate somewhat from our course and give a brief consideration to 
the application of averages to Index Numbers. 

MISCELLANEOUS QUESTIONS FOR REVIEW 

1. Find the sums:

(1) $\sum_{X=1}^{100} (2X + 5)$     (2) $\sum_{X=10}^{50} (2X^2 - 3X + 6)$

2. What is meant by the statistical analysis of a group of data? 

3. What are the purposes of a graphical presentation of a set of sta- 
tistical data? 

4 . What is a histogram? A frequency polygon? Give directions for 
constructing each. 

5. What is meant by: “The central tendency of a distribution”?
The “dispersion of a distribution”? The “skewness of a distribution”?

6. Define three measures of central tendency; three measures of dis- 
persion; three measures of skewness. 

7. From the formula defining the arithmetic mean, derive another 
formula for M . 




8. If a = 127 ± 0.2 and b = 2.2 ± 0.3, find the extreme values of

(1) a + b     (2) a - b     (3) a · b     (4) a/b

9. In what type of distributions are $Q_1$ and $Q_3$ equally distant from $M_d$?

10. From the formula defining $\sigma$, derive two other formulas for
computing $\sigma$.

11. To compute $\sigma$, is it necessary to compute M?

12. A measurement of the length of a room is recorded 22.6 ± 0.06 feet.
What does this statement mean?

13. The arithmetic mean of a sample distribution of 100 grades is
written 70 ± 0.6. What does this statement mean? What is the standard
deviation of the sample?

14. The standard deviation of a sample distribution of 100 grades is
written 8 ± 0.38. What does this statement mean? What is the standard
deviation of the sample?

15. Why is the standard deviation a good measure of dispersion?

16. Criticize the following statements: 

(1) The range is the most perfect measure of variability because it 
includes all the measurements. 

(2) In constructing a frequency distribution the selection of the class 
interval is arbitrary. 

(3) If the probable error of the mean is attached to the computed mean, 
the true mean is then exactly known. 

(4) If the sum of the frequencies is equal to the count of the original 
scores, the tabulation is correct. 

(5) A score recorded as 80 means that the measure extends from 80 to 81. 

(6) If a class is designated “85-89,” the correct midpoint would always 
be 87.5. 

(7) M ± $\sigma$ establishes an interval that always includes about (2/3)N.

17. The analysis of an approximately normal distribution of the weekly
salaries of 600 men gave M = $30 and $\sigma$ = $5.

(1) About how many received salaries between $25 and $35?

(2) Assuming that Range = 6$\sigma$, about what was the maximum salary?
The minimum salary? 

18. The heights and weights of 1,515 men gave two approximately 
normal distributions with the following statistical constants: 


            Heights            Weights

N           1515               1515
M           67.92 inches       138.88 pounds
M_d         68.02 inches       137.62 pounds
σ           2.43 inches        17.2 pounds


(1) Which distribution shows the greater dispersion? Why? 

(2) Which distribution shows the greater skewness? Why? 





19. The accompanying distribution gives the percentage fat content of
milk as shown by 8,037 milking records. The data were taken from
Bulletin 245 of the University of Illinois Agricultural Experiment
Station, p. 603.

Compute: M, M_d, M_o by fitting a parabola, Q_1, Q_3, σ, and Sk.

Find E_M and interpret it. Find E_σ and interpret it.

   X       f(X)

  3.85        3
  4.05       41
  4.25      127
  4.45      303
  4.65      524
  4.85      852
  5.05     1033
  5.25     1106
  5.45     1137
  5.65      983
  5.85      799
  6.05      532
  6.25      281
  6.45      177
  6.65       80
  6.85       37
  7.05       16
  7.25        3
  7.45        3

  Total    8037

20. The data in the accompanying table give the head circumference
(centimeters) of 1,071 boys. The data were taken from: “The Evaluation
of Anthropometric Data,” by Winfield S. Hall, Journal of American
Medical Association, Vol. 37, p. 1646.

Find M, σ, α_3 and α_4 for this distribution.

   X       f(X)

   51         4
   52        23
   53        59
   54       108
   55       224
   56       257
   57       230
   58       110
   59        38
   60        16
   61         2

  Total    1071


21. What are two points of view that may be adopted with regard to 
the statistical analysis of a set of data? 

22. Does σ meet Yule's requirements of a good average?

23. A class was given two tests with the following results: M_1 = 76,
σ_1 = 11; M_2 = 59, σ_2 = 14. A student made 92 on the first test and
82 on the second test. On which test did he do better?

24. The following distribution presenting the life experience of wooden
telephone poles was adapted from Robley Winfrey and Edwin B. Kurtz:
Life Characteristics of Physical Property, Bulletin 103, Iowa Engineering
Experiment Station, p. 57. Compute M, σ, E_M and E_σ.

Life in Years   Number of Poles     Life in Years   Number of Poles
                Replaced                            Replaced
      X              f(X)                 X              f(X)

      1                4                 12               95
      2                7                 13               91
      3               15                 14               73
      4               32                 15               64
      5               30                 16               38
      6               57                 17               30
      7               61                 18               18
      8               73                 19                5
      9               96                 20                1
     10              104                 21                1
     11              103                 22                2

                                       Total            1000


25. Criticize the following statements:

(1) The number 2.340 has four significant figures. 

(2) The relative error in a measurement is the ratio of the absolute 
error to the true value of the quantity measured. 

(3) The population of a city was recorded as 300,000 ± 3,000. The 
percentage error was 3 per cent. 

(4) The length of a line was measured twenty times. The arithmetic 
mean of the measurements gives the true length. 

(5) In our notation X indicates class frequency. 

(6) The guessed mean, h , should be chosen at the midpoint of a class 
interval. 

(7) Ordinarily the number of class intervals should be more than ten 
and less than thirty. 

(8) It would be possible for three people to get three different frequency 
distributions from the same data and all be right. 

(9) If the sum of the frequencies agrees with the count of the original 
measurements, the tabulation of the frequency distribution is 
correct. 

(10) The quartile points are used to measure both dispersion and skew- 
ness. 

(11) No matter what value of h is chosen, the same result will be obtained
for σ if the computation is correct.

(12) In symmetrical distributions the first and third quartile points are
equidistant from M_d.





(13) The standard deviation is a point, not a distance. 

(14) The range of a mound-shaped distribution equals 3σ approximately.

(15) The probable error of the mean shows what mistake was probably 
made in computing M. 

(16) The statement M = 75 ± 3 means that the true value of M lies 
between 72 and 78. 

(17) When M is greater than M_d, the skewness is positive. The skewness
is also positive if q_2 is greater than q_1.

(18) If the probable error is attached to a statistical constant, the 
results are then exact. 

(19) A distance of 3σ laid off on both sides of M establishes an interval
that includes about 99 per cent of the total frequency of a mound-shaped
distribution.

(20) For a manufacturer of hats, the mode is a more important measure 
of central tendency than the arithmetic mean. 

(21) M_h of a group of numbers is the reciprocal of M of the group.

(22) M_h of the numbers 2, 3, and 6 is greater than their M.

26. The data of the following tables are taken from Bulletin No. 623
of the U.S. Department of Labor, “Wages, Hours, and Working Conditions
in the Bread-Baking Industry, 1934.” They present the hourly earnings
in December, 1934 of employees distributed as to sex.

Compute M, M_d, σ, and Sk for each distribution.

Class (cents)          Males     Females    Males and Females
(a.u. = and under)     f(X)      f(X)       f(X)

  0   a.u.  12.5           1         0           1
 12.5 a.u.  17.5           6         1           7
 17.5 a.u.  22.5          14         3          17
 22.5 a.u.  27.5         148       165         313
 27.5 a.u.  32.5         509       635        1144
 32.5 a.u.  37.5        1517       746        2263
 37.5 a.u.  42.5        2615       545        3160
 42.5 a.u.  47.5        2325       205        2530
 47.5 a.u.  52.5        1853       138        1991
 52.5 a.u.  57.5        1698        69        1767
 57.5 a.u.  62.5        1387        34        1421
 62.5 a.u.  67.5        1418        32        1450
 67.5 a.u.  72.5        1169        10        1179
 72.5 a.u.  77.5        1052         9        1061
 77.5 a.u.  85.0         876         6         882
 85.0 a.u. 100          1148        13        1161
100   a.u. 120           465         3         468
120   a.u. 150           147         0         147

Total                  18348      2614       20962



Chapter 6 

INDEX NUMBERS 1 
46. INTRODUCTION 

In the preceding chapters we have devoted no little attention to var- 
iation as a characteristic of statistical phenomena. In characterizing 
a frequency distribution, we devoted an entire chapter to the measure- 
ment of dispersion, a measurement of the extent to which the indi- 
vidual items vary on the average from the arithmetic mean. From one 
point of view, simple correlation is a study of the variation that occurs 
on the average in one variable when a linearly related variable changes 
by a given amount. In the study of the normal curve, we must have 
been impressed with the fact that the equation defining this curve de- 
scribes a very particular kind of variation of a group of measurements 
from their arithmetic mean. Our formulas for estimating reliability 
are efforts to define a range of variation about a statistical constant 
within which fluctuations, due to pure chance, may be expected to
occur according to definite probabilities. Each of these important 
statistical concepts emphasizes, therefore, a particular kind of varia- 
tion. Speaking rather broadly, we may say that statistical analysis 
is largely a study of variation in statistical phenomena. 

In this chapter we shall still be concerned with variation as a char- 
acteristic of our data, but we shall regard the variation in a different 
manner than we have done previously. Stated in rather general 
terms, our present objective is the reduction of series of data, more 
or less complex, to numbers purely relative which will facilitate com- 
parison. Thus, we shall be interested primarily in measuring relative 
variations in the magnitudes of statistical groups. The statistical de- 
vices by which we do this are called index numbers. 

47. RELATIVES 

In their simplest forms, index numbers are ratios, generally ex- 
pressed as percentages, of one quantity to another quantity of the 

1 This chapter may be omitted without destroying the continuity. 



same kind called the base. Index numbers have been most widely 
employed in the study of price changes, but they also may be em- 
ployed in the study of variation in unemployment, in production, in 
building, in manufacturing, — in short, wherever group movements 
are to be measured. 


Table 30. Production of Motor Vehicles in the
United States, 1920-1929 1

Year     Number            Relatives     Link
         (in thousands)    to 1920       Relatives
(1)      (2)               (3)           (4)

1920     2227              100
1921     1682               76            76
1922     2646              119           157
1923     4180              188           158
1924     3738              168            89
1925     4428              199           118
1926     4506              202           102
1927     3580              161            79
1928     4601              207           129
1929     5622              252           122


Consider the data of Table 30. Column (2) gives the total pro- 
duction ( aggregates ) of motor vehicles produced in the United States 
in the years 1920-1929. It is readily observed from column (3) 
that a comparison of the values for different dates with the value at 
some fixed base, or a study of the variation in production relative to 
some fixed base, is greatly facilitated by reducing the several aggre- 
gates to a series of percentages ( relatives ). If the production in 1920 
is taken as the base production and is represented by 100, the production
relative for any other year merely expresses the production
of that year as a percentage of the production for the base year.
That is,

\[
\text{Relative for a given year} = \frac{\text{Production for given year}}{\text{Production for base year}} \times 100
\]


Thus, each item in column (3) is the ratio of the corresponding item 
in column (2) to the 1920 production, expressed as a percentage. 


1 The data are taken from Statistical Abstract of the United States , 1930, p. 385. 





If it is desired to compare the values for each year with those of 
the preceding year, a link relative may be employed. The link relative 
for any year is constructed by dividing the value in that year by the 
value in the preceding year, and expressing the result as a percentage. 
That is, 


\[
\text{Link relative for a given year} = \frac{\text{Value for given year}}{\text{Value for preceding year}} \times 100
\]

Thus, in Table 30, the link relative for 1922 is (2646/1682) × 100 = 157.
In order to distinguish them, the relatives shown in column (3) are 
called fixed-base relatives. 

The link relatives thus establish a chain of relatives, each year being 
tied to the preceding year, and from the link relatives we may obtain 
a further set of relatives called chain relatives. We assign 100 as the 
chain relative for the first year and define the chain relative for any 
other year to be the product of the link relative for that year and the 
chain relative for the preceding year, the product to be divided by 
100. It should be evident from the definitions that, when a single 
commodity is involved, the chain relatives are equal to the fixed- 
base relatives. 
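The three kinds of relatives defined above can be checked numerically. What follows is a minimal Python sketch, assuming the production figures of column (2) of Table 30; the function names are illustrative and do not come from the text.

```python
# Production of motor vehicles (thousands), column (2) of Table 30.
years = list(range(1920, 1930))
production = [2227, 1682, 2646, 4180, 3738, 4428, 4506, 3580, 4601, 5622]


def fixed_base_relatives(values, base_index=0):
    """Each value as a percentage of the value in the base period."""
    base = values[base_index]
    return [100.0 * v / base for v in values]


def link_relatives(values):
    """Each value as a percentage of the preceding value (none for the first year)."""
    return [None] + [100.0 * values[k] / values[k - 1] for k in range(1, len(values))]


def chain_relatives(links):
    """Chain relative = link relative x preceding chain relative / 100, starting at 100."""
    chain = [100.0]
    for link in links[1:]:
        chain.append(link * chain[-1] / 100.0)
    return chain


fixed = fixed_base_relatives(production)
links = link_relatives(production)
chain = chain_relatives(links)

for year, f, l, c in zip(years, fixed, links, chain):
    link_text = "  -" if l is None else f"{l:3.0f}"
    print(year, f"{f:4.0f}", link_text, f"{c:4.0f}")
```

For a single series such as this one, the unrounded chain relatives reproduce the fixed-base relatives, which is the equality stated in the text.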

Simple relatives may be employed to compare the fluctuations in 
two or more variables, and to permit the computation of an average 
price relative. To facilitate the comparison of the fluctuations in the 
prices of corn and hogs in the United States for the decade 1920- 
1929 — see Table 31 — we have computed their fixed-base relatives, 
shown in columns (4) and (5), with the prices in 1920 as the base 
prices. It can now be seen at a glance how one set of relative prices 
changes as compared with the other. To explain the behavior of 
the fluctuations recorded in the table would require other data that 
are not included here. The numbers in column (6), which are the 
arithmetic means of the numbers in columns (4) and (5), give the 
average price relatives based upon the two given commodities. Thus, 
the general average price of these two commodities was 3 per cent 
higher in 1924 than in 1920, and was 19 per cent lower in 1923 than 
in 1920. 
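The average price relative of column (6) is simply the arithmetic mean of the relatives in columns (4) and (5). A small Python sketch, assuming the 1923 and 1924 relatives printed in Table 31 (the helper name is hypothetical):

```python
# Fixed-base price relatives (1920 = 100) from columns (4) and (5) of Table 31.
corn_relatives = {1923: 108, 1924: 147}
hog_relatives = {1923: 54, 1924: 58}


def average_price_relative(*relatives):
    """Unweighted arithmetic mean of the relatives of several commodities."""
    return sum(relatives) / len(relatives)


for year in (1923, 1924):
    avg = average_price_relative(corn_relatives[year], hog_relatives[year])
    change = avg - 100  # percentage change from the base year
    print(year, avg, f"{change:+.1f} per cent relative to 1920")
```

The unrounded value for 1924 is 102.5, which the table reports as 103; this is the source of the "3 per cent higher" in the text.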

The price relatives in Table 31 have been based upon the prices 
of 1920. Of course the prices for any other year could have been 
chosen as the bases. The averages of the decade prices, 70.3 cents
per bushel for corn and 10.08 dollars per hundred pounds for hogs,
would have been more satisfactory bases since they are representative
and are less affected by chance variations.

Table 31. Prices of Corn and Hogs in the United States
for the Years 1920-1929, and Their Relatives 1

         Corn          Hogs            Relatives (1920 = 100)     Average
Year     (cents per    (dollars per    Price of     Price of      price
         bushel)       100 pounds)     corn         hogs          relative
(1)      (2)           (3)             (4)          (5)           (6)

1920      67.0          13.91          100          100           100
1921      42.3           8.51           63           61            62
1922      65.8           9.22           98           66            82
1923      72.6           7.55          108           54            81
1924      98.2           8.11          147           58           103
1925      67.4          11.81          101           85            93
1926      64.2          12.34           96           89            93
1927      72.3           9.95          108           72            90
1928      75.2           9.22          112           66            89
1929      78.1          10.16          117           73            95

Totals   703.1         100.78         1050          724           888
Means     70.3          10.08          105           72            89

EXERCISES 

1. Compute the fixed-base relatives (1909 = 100) for the data of Table 
11, page 45. 

2. Compute the fixed-base and the link relatives (1909-1910 = 100) 
for the data of Exercise 18, page 106. 

3. With the arithmetic mean of the production as base, compute the 
fixed-base relatives for the data of Exercise 12, page 57. 

4. Using the arithmetic means of columns (2) and (3) as bases, compute
the average price relatives for the data of Table 31. 


48. DEFINITIONS AND NOTATION 

We have defined index numbers to be devices which summarize 
the relative fluctuations in a group of variables. Inasmuch as the 
essential purpose of an index number is to measure the variation in a 

1 The data are taken from Statistical Abstract of the United States, 1930, p. 682 
and p. 661. 





group of variables, it is probably better practice to employ the terms 
“relative numbers” and “relatives” when referring to single series
in terms of a fixed base, and to reserve the term “index number” to
describe the variation in a group of variables in combination. The 
numbers in column (3) of Table 30 may properly be called “ rela- 
tives” whereas those of column (6) of Table 31 may properly be 
called “index numbers.” While index numbers are sometimes ex- 
pressed as mere aggregates, yet more generally they are expressed as 
percentages of the values in an arbitrarily chosen base period. 1 

Many methods may be employed in the construction of index 
numbers, and there are differences of opinion as to which is the best 
method. In our treatment, we shall devote the emphasis to the best 
known methods of construction and attempt to avoid controversial 
questions. We shall make use of the following symbols: 

p'_0 = price of the first commodity at time “0” (the base period)
p'_i = price of the first commodity at time “i”
p^(n)_i = price of the nth commodity at time “i”
q'_0 = quantity of the first commodity at time “0”
q'_i = quantity of the first commodity at time “i”
q^(n)_i = quantity of the nth commodity at time “i”

p_i/p_0 = a price relative (ratio of price of a given commodity at time
“i” to the price of the same commodity at time “0,” expressed
as a percentage)

q_i/q_0 = a quantity relative

\[
\sum p_i q_i = p'_i q'_i + p''_i q''_i + \cdots + p^{(n)}_i q^{(n)}_i
\]

\[
\sum p_i q_0 = p'_i q'_0 + p''_i q''_0 + \cdots + p^{(n)}_i q^{(n)}_0
\]

₀P_i = the price index for the time “i”

₀Q_i = the quantity index for the time “i”


49. UNWEIGHTED INDEX NUMBERS 

In the construction of unweighted (or simple) index numbers, the 
individual members of the group are all regarded as of equal im- 
portance. The influence of no member of the group is to be weighted 

1 In this book we shall assume that relatives and index numbers are expressed 
as percentages. 





by multiplying the member by some quantity or weight. If some 
members of the group are to be considered as more important than 
others, we shall apply to the important members weights that are 
expected to reflect their relative importance. Unweighted indices 
will be considered in this section; weighted, in the next. 

A. Simple Aggregative Relatives. An aggregative index number
is based upon the sums (aggregates) of the items for the several years.
The aggregative relative is found by comparing the results thus secured
for different dates. If prices are in question, the aggregative relative
is given by

\[
{}_0P_i = \frac{\sum p_i}{\sum p_0} \tag{1}
\]




To illustrate the method of computing aggregative relatives, let us
consider the data of Table 32, which gives the farm prices in cents
per bushel of seven important grains.

Table 32. Farm Prices in Cents per Bushel of Grains
in the United States 1

Computing the aggregative relatives

Grain                     1921     1923     1925     1927     1929

Corn                      42.3     72.6     67.4     72.3     78.1
Wheat                     92.6     92.3    141.6    111.5    104.3
Oats                      30.2     41.4     38.0     45.0     43.5
Rye                       69.7     65.0     78.2     85.3     87.1
Barley                    41.9     54.1     58.8     67.8     55.0
Buckwheat                 81.2     93.3     88.8     83.5     97.7
Rice                      95.2    110.2    153.8     92.9     97.8

Σp_i                     453.1    528.9    626.6    558.3    563.5

₀P_i = Σp_i / Σp_0       100      117      138      123      124

1 The data are taken from Statistical Abstract of the United States, 1930, pp. 682-
683.

We find the aggregates Σp_i of the prices for each of the several years.
Choosing 1921 as the base year (where i = 0), we find the aggregative
relatives Σp_i / Σp_0 for the other years and express our results as
percentages. We note that the aggregative relative for 1925 is 138.
This may be interpreted to mean that the farm prices of these grains
for 1925 were, on the average, 38 per cent higher than for 1921.
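The aggregates and aggregative relatives of Table 32 can be reproduced with a few lines of Python. This is only a sketch under the assumption that the prices are those printed in the table; the function and variable names are illustrative.

```python
# Farm prices in cents per bushel, from Table 32.
years = [1921, 1923, 1925, 1927, 1929]
prices = {
    "Corn":      [42.3, 72.6, 67.4, 72.3, 78.1],
    "Wheat":     [92.6, 92.3, 141.6, 111.5, 104.3],
    "Oats":      [30.2, 41.4, 38.0, 45.0, 43.5],
    "Rye":       [69.7, 65.0, 78.2, 85.3, 87.1],
    "Barley":    [41.9, 54.1, 58.8, 67.8, 55.0],
    "Buckwheat": [81.2, 93.3, 88.8, 83.5, 97.7],
    "Rice":      [95.2, 110.2, 153.8, 92.9, 97.8],
}


def aggregative_relatives(price_table, base_column=0):
    """Return the yearly aggregates and each aggregate as a percentage of the base."""
    n_periods = len(next(iter(price_table.values())))
    aggregates = [sum(row[k] for row in price_table.values()) for k in range(n_periods)]
    base = aggregates[base_column]
    return aggregates, [100.0 * a / base for a in aggregates]


aggregates, relatives = aggregative_relatives(prices)
for year, total, rel in zip(years, aggregates, relatives):
    print(year, f"{total:6.1f}", f"{rel:4.0f}")
```

Changing `base_column` re-expresses the same aggregates on a different base year without touching the price data.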

It is evident that the computation of the aggregative relative re- 
quires that all items be reduced to the same unit, otherwise we would 
be combining non-homogeneous things and the sums would have no 
meaning. 1 To illustrate, consider the following prices of several 
commodities in 1925: 

Anthracite coal $5.30 per ton (2000 pounds) 

Cotton 0.182 per pound 

Potatoes 2.10 per bag (100 pounds) 

Wheat 1.60 per bushel (60 pounds) 

We may reduce these prices to the same unit, and quote them as 
follows: 

Anthracite coal $0.265 per 100 pounds

Cotton 18.20 per 100 pounds 

Potatoes 2.10 per 100 pounds 

Wheat 2.67 per 100 pounds 

The well-known Bradstreet's index is based upon the simple aggregative
method, the items being reduced to prices per pound. The
aggregates Σp_i, themselves, are the indexes; however, they may be
converted into a series of percentages upon any chosen base. It
should be noted that the conversion of all prices into prices per pound
effects a concealed weighting for which there is no logical basis.
Thus, in 1925 in an aggregate of per pound prices, a pound of cotton
was worth 9 times as much as a pound of potatoes and 69 times as
much as a pound of coal. This illogical emphasis given to high-priced
articles is somewhat neutralized in Bradstreet's index by the introduction
of a logical element in that more than one quotation is given
for some of the more important commodities and only one for the
less important articles.
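The concealed weighting can be made visible by reducing the four 1925 quotations above to a common unit. The following Python sketch assumes the prices and unit sizes given in the text; the helper name is hypothetical.

```python
# 1925 quotations from the text: (price in dollars, pounds in the quoted unit).
quotations = {
    "Anthracite coal": (5.30, 2000),  # per ton
    "Cotton": (0.182, 1),             # per pound
    "Potatoes": (2.10, 100),          # per bag
    "Wheat": (1.60, 60),              # per bushel
}


def price_per_100_pounds(price, pounds_in_unit):
    """Convert a quoted price to dollars per 100 pounds."""
    return 100.0 * price / pounds_in_unit


per_100 = {name: price_per_100_pounds(p, lb) for name, (p, lb) in quotations.items()}
cotton = per_100["Cotton"]

# In a simple aggregate of same-unit prices, each commodity's influence is
# proportional to its price, so the ratio to cotton shows the implied weight.
for name, value in per_100.items():
    print(f"{name:16s} {value:7.3f}   cotton is {cotton / value:5.1f} times this")
```

The ratios of about 69 (coal) and 9 (potatoes) are the figures quoted in the paragraph above.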

B. Simple Average of Relatives. Another method of constructing 
index numbers is that of finding some simple average of the relatives 
for the given items, the relative for a given commodity at a given 
time being referred to the same commodity at a certain basic date. 
We may use the arithmetic mean, the geometric mean, the median, 

1 It should not be assumed that an aggregative relative based upon such a
reduction will necessarily present a logical index.


the mode, and the harmonic mean of the relatives. Assuming that a 
table of actual amounts has been prepared — such, for example, as 
the prices of Table 32 — the steps involved in the process are: 

1. Reduce each item (price, quantity, value, et cetera) in the time
“i” for which the index is desired to a percentage (relative) of the
item for the same commodity in the base period. That is, if prices
are in question, find p_i/p_0 for each commodity; if quantities are in
question, find q_i/q_0 for each commodity; if values are in question,
find v_i/v_0, and express all the relatives as percentages.

2. Compute the averages of the relatives found. 

The arithmetic mean of the price relatives at time “i” is given by

\[
{}_0P_i = \frac{1}{N} \sum \frac{p_i}{p_0} \tag{2}
\]

where N is the number of prices.

The geometric mean of the N price relatives at time “i” is given by

\[
{}_0P_i = \sqrt[N]{\prod \frac{p_i}{p_0}} \tag{3}
\]

where Π means “the product of such terms as,” and is computed
with the aid of logarithms.

The median of the relatives at time “i” is, of course, found by 
arranging the relatives at time “i” in the order of their magnitude. 
If N is odd, the middle term is the median. If N is even, we define 
the median to be one half the sum of the two middle terms. 

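The harmonic mean of the relatives at time “i” is given by the
formula

\[
{}_0P_i = \frac{N}{\dfrac{p'_0}{p'_i} + \dfrac{p''_0}{p''_i} + \cdots + \dfrac{p^{(n)}_0}{p^{(n)}_i}} = \frac{N}{\sum \dfrac{p_0}{p_i}} \tag{4}
\]

In column (6) of Table 31 we have shown the arithmetic means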
of the relatives for the prices of two commodities, corn and hogs, for 
the years 1921 to 1929 with the year 1920 as the base. When several 
commodities are being investigated, it is better to arrange the table 
with the list of commodities in the stub and the “times” in the box 
heads as was done in Table 32. 

Table 33. Price Relatives of Grains in the United States,
Based upon Table 32

(1921 = 100)

Computing simple averages of relatives

Grain                          1921    1923    1925    1927    1929

Corn                           100     172     159     171     187
Wheat                          100     100     153     120     113
Oats                           100     137     126     149     144
Rye                            100      93     112     122     125
Barley                         100     129     140     162     131
Buckwheat                      100     115     109     103     120
Rice                           100     116     162      98     103

Totals                         700     862     961     925     923

Arithmetic Mean of relatives   100     123     137     132     132
Median of relatives            100     116     140     122     125
Geometric Mean of relatives    100     121     136     130     130

As an illustrative problem, consider Table 33 which gives, in the
body of the table, the relatives of the farm prices of seven important
grains in the United States. These data were derived from Table 32
by methods previously explained. Thus, the

price relative for corn in 1923 = (72.6/42.3) × 100 = 172

price relative for buckwheat in 1929 = (97.7/81.2) × 100 = 120

We continue this process until the table of relatives is complete. 
We then compute the averages for the several years. For example, 
the geometric mean of the relatives for 1923 is given by 

\[
{}_0P_i = \sqrt[7]{172 \cdot 100 \cdot 137 \cdot 93 \cdot 129 \cdot 115 \cdot 116}
\]

or by

\[
\log {}_0P_i = \tfrac{1}{7}\,[\log 172 + \log 100 + \log 137 + \log 93 + \log 129 + \log 115 + \log 116]
\]

Numbers     Logarithms

172          2.23 553
100          2.00 000
137          2.13 672
 93          1.96 848
129          2.11 059
115          2.06 070
116          2.06 446

Sum         14.57 648

log ₀P_i = 14.57 648 / 7 = 2.08 235

₀P_i = 120.9
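The several averages of the 1923 relatives can be verified with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). This is a sketch for checking the arithmetic only, assuming the 1923 column of Table 33; it uses natural logarithms internally rather than the five-place common logarithms of the text.

```python
import statistics

# Price relatives for 1923 (1921 = 100), from Table 33.
relatives_1923 = [172, 100, 137, 93, 129, 115, 116]

arithmetic = statistics.mean(relatives_1923)
median = statistics.median(relatives_1923)
geometric = statistics.geometric_mean(relatives_1923)
harmonic = statistics.harmonic_mean(relatives_1923)

print(f"arithmetic mean  {arithmetic:6.1f}")
print(f"median           {median:6.1f}")
print(f"geometric mean   {geometric:6.1f}")
print(f"harmonic mean    {harmonic:6.1f}")
```

For positive relatives that are not all equal, the harmonic mean is the smallest and the arithmetic mean the largest of the three means, which is worth keeping in mind when the indexes are compared.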

EXERCISES 

1. Find the aggregative relatives for the production data given in
Table 34 using 1921 as the base year.

Table 34. Production, in Millions of Bushels, of Grains
in the United States 1

Grain           1921     1923     1925     1927     1929

Corn            3069     [...]    2916     2763     2622
Wheat            815      797      677      878      807
Oats            1078     [...]    1488     1183     1239
Rye               62     [...]      46       58       41
Barley           155      198      214      266     [...]
Buckwheat         14       14       14       16       12
Rice              38       34       33       45       40

Totals          5231     5466     5388     5209     5068

Aggregative
Relative

2. Compute the harmonic mean of the relatives for the data of Table 32. 

3. Compare the five index numbers that have been computed for the 
data of Table 32. 

4. Verify the production relatives given in Table 35. For this table,
compute (1) the arithmetic means, (2) the medians, (3) the geometric 
means of the relatives for the years 1923, 1925, 1927, and 1929. 

1 The data are taken from Statistical Abstract of the United States. 1930, pp. 
682-683. 



Table 35. Production Relatives of Grains in the United States,
Based upon Table 34

(1921 = 100)

Grain           1921    1923    1925    1927    1929

Corn            100     100      95      90      85
Wheat           100      98      83     108      99
Oats            100     121     138     110     115
Rye             100     102      74      94      66
Barley          100     128     138     172     198
Buckwheat       100     100     100     114      86
Rice            100      89      87     118     105


50. WEIGHTING 

In our previous discussion we attempted to regard all items as of 
equal importance, although a concealed, “unconscious” weighting 
was conceded. We admitted the existence of weights inherent in 
the data themselves and called attention to the fact that doing 
nothing about them may lead to illogical results. We shall now 
consider how the illogical results may be somewhat eliminated by 
the process of weighting. Weighting is the term used to describe the 
conscious effort to assign to each commodity an influence that, in 
the final result, is proportionate to its relative importance. The 
index number that results from conscious weighting is called a 
weighted index number. When no such conscious endeavor is made 
and each item is permitted to exercise an influence upon the result 
presumably equal to that of every other item, the index is said to 
be unweighted or simple. 

The weights are usually determined upon some rational basis such 
as the quantities produced or consumed in a representative period, an 
average of the quantities produced or consumed over several periods, 
or some other criterion. As multipliers, it is obvious that the weights 
may be abstract numbers, and thus that the weights may be numbers 
proportional to the quantities produced or consumed. The fact that 
actual quantity figures of production and consumption have become 
increasingly available within recent decades has tended to encourage 
their use as weights in index number construction. Two methods 



of weighting by quantity figures are widely used: the first is called
“weighting by base period quantities,” and the second is called
“weighting by given period quantities.” A third method, that of
weighting by an average of the base period quantities and the given 
period quantities, is growing in favor. 

There are two reasons for weighting by base period quantities. 
In the first place, despite the increasing availability of quantity 
figures, they are not easily obtained for many commodities for the 
given period. In the second place, the relative variations in the 
quantities from period to period are frequently not sufficiently large 
to result in significant errors in the indexes when the quantities are 
assumed constant for a few successive periods. 


51. WEIGHTED AGGREGATES


If we employ the quantities produced in the base period q_0 as
weights, the weighted aggregative relative index number of prices
at time “i” is given by

\[
{}_0P_i = \frac{\sum p_i q_0}{\sum p_0 q_0} \tag{5a}
\]


which is merely the ratio of the weighted aggregate at time “i” to 
the total value in the base period. This is possibly our most widely 
used index number. 

We shall illustrate the use of this formula in Table 36 by con- 
structing the index number based upon the weighted aggregate of 
actual prices in cents per bushel of grains in the United States for the 
year 1925 with the year 1921 as the base year. The data are taken 
from Tables 32 and 34. 

If we employ the quantities produced in the given period as 
weights, the weighted aggregative relative index number of prices 
at time “i” is given by 


\[
{}_0P_i = \frac{\sum p_i q_i}{\sum p_0 q_i} \tag{5b}
\]

We shall find that the weighted aggregative relatives, (5a) and
(5b), are basic formulas for the “Ideal” index given in Section 55.

To illustrate the construction of an index of weighted aggregates 
based upon formula (5b), we request the student to complete Table 
37. The data are taken from Tables 32 and 34. 



Table 36. Index Number of Grain Prices in the United
States for 1925

(1921 = base year)

Weighted aggregative method

Grain         Price 1921    Weight    Price 1921      Price 1925    Price 1925
                                      times Weight                  times Weight
              p_0           q_0       p_0 q_0         p_i           p_i q_0

Corn          42.3          3069      129 818.7        67.4         206 850.6
Wheat         92.6           815       75 469.0       141.6         115 404.0
Oats          30.2          1078       32 555.6        38.0          40 964.0
Rye           69.7            62        4 321.4        78.2           4 848.4
Barley        41.9           155        6 494.5        58.8           9 114.0
Buckwheat     81.2            14        1 136.8        88.8           1 243.2
Rice          95.2            38        3 617.6       153.8           5 844.4

Totals                                253 413.6                     384 268.6

Price 1921 is in cents per bushel.

Price 1925 is in cents per bushel.

Weights are quantities produced in 1921 in millions of bushels.

Σp_0 q_0 = 253 413.6          Σp_i q_0 = 384 268.6

\[
{}_0P_i = \frac{\sum p_i q_0}{\sum p_0 q_0} \times 100 = \frac{384\,268.6}{253\,413.6} \times 100 = 151.6
\]
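The computation of Table 36 may be sketched in Python as follows, assuming the 1921 and 1925 prices of Table 32 and the 1921 production of Table 34. The function name and the tuple layout are illustrative choices, not the author's notation.

```python
# name: (price 1921 p_0, price 1925 p_i, production 1921 q_0)
# Prices in cents per bushel, production in millions of bushels.
grains = {
    "Corn":      (42.3, 67.4, 3069),
    "Wheat":     (92.6, 141.6, 815),
    "Oats":      (30.2, 38.0, 1078),
    "Rye":       (69.7, 78.2, 62),
    "Barley":    (41.9, 58.8, 155),
    "Buckwheat": (81.2, 88.8, 14),
    "Rice":      (95.2, 153.8, 38),
}


def weighted_aggregative_index(prices_given, prices_base, weights):
    """100 * sum(p_i * w) / sum(p_0 * w), the same weights in both aggregates."""
    numerator = sum(p * w for p, w in zip(prices_given, weights))
    denominator = sum(p * w for p, w in zip(prices_base, weights))
    return 100.0 * numerator / denominator


p0 = [row[0] for row in grains.values()]
pi = [row[1] for row in grains.values()]
q0 = [row[2] for row in grains.values()]

index_1925 = weighted_aggregative_index(pi, p0, q0)
print(round(index_1925, 1))
```

Passing the given-period quantities as `weights` instead of `q0` would yield the index weighted by given period quantities.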

Table 37. Production and Price of Grains in the
United States in 1921 and 1925

              Price (cents)       Production
                                  (millions of bushels)
Grain         1921     1925       1921     1925
              p_0      p_i        q_0      q_i         p_0 q_i     p_i q_i

Corn          42.3      67.4      3069     2916
Wheat         92.6     141.6       815      677
Oats          30.2      38.0      1078     1488
Rye           69.7      78.2        62       46
Barley        41.9      58.8       155      214
Buckwheat     81.2      88.8        14       14
Rice          95.2     153.8        38       33

Totals





EXERCISES 

1. (Lovitt and Holtzclaw.) The following table gives the average price 
and weights (quantity used per year by the average workingman’s family) 
of several important items of food. Using 1913 as the base year, compute 

(a) the simple aggregative relative of prices for 1915, 1918, 1920 and 
1922, 

(b) the simple arithmetic mean of relatives for 1915, 1918, 1920 and 
1922, 

(c) the weighted aggregative relative index for the years 1920 and 1922. 


Commodity        Unit       Weights                    Prices
                                       1913      1915      1918      1920      1922

Sirloin Steak    lb.           15     $0.254    $0.257    $0.389    $0.437    $0.374
Round Steak      lb.           40       .223      .230      .369      .395      .323
Bacon            lb.           13       .270      .269      .529      .523      .398
Eggs             doz.          70       .345      .341      .569      .681      .444
Butter           lb.           76       .383      .358      .577      .701      .479
Milk             qt.          424       .089      .088      .139      .167      .131
Flour            1/8 bbl.       8       .809     1.029     1.642     1.985     1.250
Potatoes         peck          50       .255      .225      .480      .945      .420
Sugar            lb.          145       .055      .066      .097      .194      .073


2. The following were the retail prices of some foods during 1926 and 
1934: 


Commodity        Unit     Price 1926     Price 1934

Round Steak      lb.        $0.36           .28
Potatoes         lb.        $0.05           .02
Beans            lb.        $0.09           .06
Butter           lb.        $0.58           .31
Coffee           lb.        $0.51           .29
Flour            lb.        $0.06           .05

Compute the simple indexes to fill in the blanks below, and criticize them.

                       1926 = Base Year        1934 = Base Year
                       1926       1934         1926       1934

A.M. of relatives      100                                100
G.M. of relatives      100                                100
Aggre. relative        100                                100





52. WEIGHTED AVERAGES OF RELATIVES 

Weighted index numbers may also be computed from weighted 
averages of relatives. The averages most widely used are the arith- 
metic mean and the geometric mean. The formulas for the weighted 
arithmetic and the weighted geometric means are derived immediately 
from those given on pages 61 and 90 by simply considering the frequencies
as weights. Thus, if X_i is any value and w_i its weight, we
have

\[
M = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wX}{\sum w} = \frac{\sum wX}{N} \tag{6}
\]

for the weighted arithmetic mean, and

\[
M_g = \sqrt[N]{X_1^{w_1} X_2^{w_2} \cdots X_n^{w_n}} = \sqrt[N]{\prod X^{w}} \tag{7}
\]

for the weighted geometric mean, where

\[
N = w_1 + w_2 + \cdots + w_n
\]

Written logarithmically, formula (7) becomes

\[
\log M_g = \frac{w_1 \log X_1 + w_2 \log X_2 + \cdots + w_n \log X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w \log X}{\sum w} = \frac{\sum w \log X}{N} \tag{8}
\]

The weighted harmonic mean is given by

\[
M_h = \frac{w_1 + w_2 + \cdots + w_n}{\dfrac{w_1}{X_1} + \dfrac{w_2}{X_2} + \cdots + \dfrac{w_n}{X_n}} = \frac{N}{\sum \dfrac{w}{X}} \tag{9}
\]
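The weighted means just defined translate directly into short Python functions. The sketch below is illustrative only: the function names are assumptions, and the values and weights are hypothetical numbers chosen so that the results are easy to check by hand.

```python
import math


def weighted_arithmetic_mean(values, weights):
    """Sum of w*X divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def weighted_geometric_mean(values, weights):
    """Computed through logarithms: log M_g = sum(w * log X) / sum(w)."""
    log_mean = sum(w * math.log10(x) for x, w in zip(values, weights)) / sum(weights)
    return 10.0 ** log_mean


def weighted_harmonic_mean(values, weights):
    """Sum of the weights divided by the sum of w/X."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


# Hypothetical values and weights, chosen only for illustration.
values = [2.0, 4.0, 8.0]
weights = [1, 2, 1]

wam = weighted_arithmetic_mean(values, weights)   # (2 + 8 + 8) / 4
wgm = weighted_geometric_mean(values, weights)    # fourth root of 2 * 4^2 * 8
whm = weighted_harmonic_mean(values, weights)     # 4 / (1/2 + 2/4 + 1/8)
print(wam, wgm, whm)
```

With all weights equal to one, each function reduces to the corresponding unweighted mean.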


It should be emphasized that “in weighting individual price rela- 
tives, quantities will not serve. The abstract relatives must be 
weighted by values , if the resulting products are to be comparable. 
For values are in terms of a common dollar unit, while quantities 
may be expressed in a variety of units.” 1 

A. The Weighted Arithmetic Mean of Relatives. An index of
this type may be obtained in several ways. We may weight each
relative by base period values, by given period values, or by an
average of the base period values and the given period values.
Weighting by base period values is the method most widely used.

1 Frederick C. Mills, Statistical Methods, Revised, 1938, p. 195.

To compute a weighted arithmetic mean of relatives for time “i,”
weighting by base period values, we multiply each relative $p_i/p_0$ by
the value $p_0 q_0$ of the corresponding commodity in the base period,
and express the sum of the products as a relative of the total value in
the base period.

We shall illustrate the computation of this type of index in Table 
38 by constructing the index of the prices of grains in the United 
States in the year 1925 with the year 1921 as the base year. The 
data are taken from Tables 33 and 36. 

Table 38. Index Number of Grain Prices in the 
United States for 1925 


(1921 = base year) 

Weighted arithmetic mean of relatives 


Grain        Relative      Relative      Weight         Relative Price 1925
             Price 1921    Price 1925    p_0 q_0        times Weight
                                                        (3) X (4)
(1)          (2)           (3)           (4)            (5)

Corn         100           159           129 818.7      20 641 173.3
Wheat        100           153            75 469.0      11 546 757.0
Oats         100           126            32 565.6       4 103 265.6
Rye          100           112             4 321.4         483 996.8
Barley       100           140             6 494.5         909 230.0
Buckwheat    100           109             1 136.8         123 911.2
Rice         100           162             3 617.6         586 051.2

Totals                                   253 423.6      38 394 385.1


The relative prices 1921 and 1925 were taken from Table 33. 

The weights, values of the respective grains in 1921, were taken from 
Table 36. 


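$\sum$ (relative price 1925 × weight) = 38 394 385.1
$\sum$ weight = 253 423.6

$$\text{Index number} = \frac{38\,394\,385.1}{253\,423.6} = 151.5$$

which agrees with the index 151.6 secured from the computations of
Table 36, apart from the effect of rounding the relatives to whole
numbers.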



190 


INDEX NUMBERS 


The agreement between the indexes secured by the two methods
illustrated in Tables 36 and 38 is not a coincidence, for the weighted
arithmetic mean of relative prices, weighted by the values in the base
year, is always equal (when the relatives are carried exactly) to the
relative of aggregates weighted by base year quantities. For


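$$\frac{\dfrac{p_i'}{p_0'} \times p_0' q_0' + \dfrac{p_i''}{p_0''} \times p_0'' q_0'' + \cdots + \dfrac{p_i^{(n)}}{p_0^{(n)}} \times p_0^{(n)} q_0^{(n)}}{p_0' q_0' + p_0'' q_0'' + \cdots + p_0^{(n)} q_0^{(n)}} = \frac{\sum p_i q_0}{\sum p_0 q_0}$$

or, more briefly,

$$\frac{\sum \dfrac{p_i}{p_0} \times p_0 q_0}{\sum p_0 q_0} = \frac{\sum p_i q_0}{\sum p_0 q_0}$$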

which was given in (5). 

The arithmetical computations of Table 38 could have been
considerably reduced by replacing the weights $p_0 q_0$ in column (4)
by the numbers 130, 75, 33, 4, 6, 1, 4 to which the weights are
approximately proportional (see Theorem II, p. 9). We shall leave it
as an exercise for the student to show that the result is

$${}_0P_i = 38\,348/253 = 151.6$$
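
The effect of replacing the weights by proportional numbers can be checked mechanically. The sketch below (in Python, not part of the original text) uses the whole-number relatives and the weights printed in Table 38; the function name is hypothetical. With the proportional weights the quotient is 38 348/253, and with the weights exactly as tabulated it comes out slightly lower, near 151.5, so the two choices of weights differ by less than a tenth of a point.

```python
# Relatives for 1925 (1921 = 100) and base year values p0*q0, as printed in Table 38.
relatives = [159, 153, 126, 112, 140, 109, 162]
exact_weights = [129818.7, 75469.0, 32565.6, 4321.4, 6494.5, 1136.8, 3617.6]
# Numbers approximately proportional to the base year values.
approx_weights = [130, 75, 33, 4, 6, 1, 4]

def weighted_mean_of_relatives(relatives, weights):
    """Sum of (relative x weight) divided by the sum of the weights."""
    return sum(r * w for r, w in zip(relatives, weights)) / sum(weights)

index_exact = weighted_mean_of_relatives(relatives, exact_weights)
index_approx = weighted_mean_of_relatives(relatives, approx_weights)
print(round(index_exact, 2), round(index_approx, 2))
```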

In the construction of Table 38, we made use of relatives and 
values that previously had been computed from the original data and 
recorded in Tables 33 and 36. Generally, one is called upon to 
construct the index from the original data, and we suggest the 
following form for the work-sheet when the weights are the values of 
the respective commodities for the base year. Of course if the 
weights are numbers proportional to the values, columns (7) and 
(8) can be changed accordingly. For the greatest simplicity in
computation, the weights should be expressed as percentages of
$\sum p_0 q_0$. This will mean that the sum of the weights is 100, and
the consequent division can be performed mentally.

It is evident that the numbers in column (8), which are derived by
multiplying columns (6) and (7), will give the actual values $p_i q_0$
only when the relatives given in column (6) are accurate. Since the
weights may be numbers proportional to $p_0 q_0$, column (8) should
always be found by multiplying (6) and (7) and not by multiplying
$p_i$ and $q_0$.





Form for Computing Index Numbers
Weighted arithmetic mean of relatives method
Weights: Base year values

Commodity      Unit  Price      Price       Quantity   Relative     Weight      Product of
                     Base Year  Given Year  Base Year  Price                    Weight and
                                                       Given Year               Relative Price
                     p_0        p_i         q_0        p_i/p_0      p_0 q_0
(1)            (2)   (3)        (4)         (5)        (6)          (7)         (8)

1st Commodity        p_0'       p_i'        q_0'       p_i'/p_0'    p_0' q_0'   p_i' q_0'
2nd Commodity        p_0''      p_i''       q_0''      p_i''/p_0''  p_0'' q_0'' p_i'' q_0''
nth Commodity        p_0^(n)    p_i^(n)     q_0^(n)    p_i^(n)/p_0^(n)  p_0^(n) q_0^(n)  p_i^(n) q_0^(n)

Totals

Description of data:

Prices base year are in ______ units.

Prices given year are in ______ units.

$$\text{Index number} = {}_0P_i = \frac{\sum \dfrac{p_i}{p_0} \times p_0 q_0}{\sum p_0 q_0}$$

B. Weighted Geometric Mean of Relatives. A verbal interpre- 
tation of formula (8) will point out the steps to be taken in con- 
structing the weighted geometric mean of relatives. The steps are 
as follows: 

1. Compute the relatives for the period “i” for which the index is being 
constructed. 

2. Find the logarithm of each relative. 

3. Multiply each logarithm by the given weight. 

4. Add the results obtained in Step 3. 

5. Divide the total obtained in Step 4 by the sum of the weights. This
gives $\log M_g$.

6. Find the antilogarithm of the quantity obtained in Step 5. This is 
the weighted geometric mean of the relatives. 

We shall illustrate the computation of this type of index in Table
39 by constructing the index of the prices of grains in the United
States in the year 1925 with the year 1921 as the base year. The
relatives have been computed in Table 33. We shall use as weights
the numbers 130, 75, 33, 4, 6, 1, 4, which are approximately
proportional to the actual values, given in Table 36, of the
commodities in the base year.


Table 39. Index Number of Grain Prices in the 
United States for 1925 

(1921 = base year) 


Grain        Relative      Logarithm of    Weight    Logarithm
             Price 1925    the Relative              times Weight
                           Price                     (3) X (4)
(1)          (2)           (3)             (4)       (5)

Corn         159           2.20 140        130       286.18 200
Wheat        153           2.18 469         75       163.85 175
Oats         126           2.10 037         33        69.31 221
Rye          112           2.04 922          4         8.19 688
Barley       140           2.14 613          6        12.87 678
Buckwheat    109           2.03 743          1         2.03 743
Rice         162           2.20 952          4         8.83 808

Totals                                     253       551.29 513


Relative prices for 1925 were taken from Table 33. The weights are 
numbers proportional to actual values of the commodities produced in the 
base year. They were taken from Table 36. 


$$\log M_g = \frac{551.29\,513}{253} = 2.17\,903$$

$$M_g = 151.0$$
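
The six steps listed above can be carried out mechanically. The following is a minimal sketch in Python, not part of the original text, using the relatives and the proportional weights of Table 39. It relies on full-precision common logarithms rather than a five-place table, so its intermediate figures may differ from the tabulated ones in the last decimal place. With these inputs the logarithm of the mean is close to 2.17903 and the antilogarithm comes out near 151.0.

```python
import math

# Relative prices for 1925 (1921 = 100) and proportional weights, from Table 39.
relatives = [159, 153, 126, 112, 140, 109, 162]
weights = [130, 75, 33, 4, 6, 1, 4]

# Steps 2-4: logarithm of each relative, times its weight, then the total.
log_sum = sum(w * math.log10(r) for r, w in zip(relatives, weights))

# Step 5: divide by the sum of the weights to obtain log M_g.
log_mg = log_sum / sum(weights)

# Step 6: the antilogarithm is the weighted geometric mean of the relatives.
mg = 10 ** log_mg

print(round(log_mg, 5), round(mg, 1))
```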


The student will note that the three index numbers we have
computed for these data on grains show a slight variation. The
methods used in Tables 36 and 38 result in an index of 151.6, whereas
Table 37 gives 150.1 and Table 39 gives 151.0. To judge the relative
merits of these indexes, we shall consider certain tests in Sections 54
and 55.

In the construction of Table 39, we made use of computations 
that previously had been made upon the original data. Generally, 
one is required to construct an index from the original data, and we 
suggest the following arrangement for the worksheet when computing 
an index from original data by means of the geometric mean of the 
relatives. 



Form for Computing Index Numbers 
Weighted Geometric Mean of Relatives Method 
Weights: Numbers Proportional to Base Year Values 



53. SUMMARY AND EXTENSION 


In our treatment of the index numbers of prices, we have given 
consideration to the following types, to some of which we have de- 
voted considerable attention. In addition to the median of the 
simple relatives, we have considered the following: 


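1. Simple aggregative relative:

$$\frac{\sum p_i}{\sum p_0}$$

2. Simple arithmetic mean of relatives:

$$\frac{\sum \dfrac{p_i}{p_0}}{N}$$

3. Simple geometric mean of relatives:

$$\sqrt[N]{\prod \frac{p_i}{p_0}}$$

4. Simple harmonic mean of relatives:

$$\frac{N}{\sum \dfrac{p_0}{p_i}}$$

5. Weighted aggregative relative:
(weights = base period quantities)

$$\frac{\sum p_i q_0}{\sum p_0 q_0}$$

6. Weighted aggregative relative:
(weights = given period quantities)

$$\frac{\sum p_i q_i}{\sum p_0 q_i}$$

7. Weighted arithmetic mean of relatives:
(weights = base period values)

$$\frac{\sum \dfrac{p_i}{p_0} \times p_0 q_0}{\sum p_0 q_0}$$

8. Weighted geometric mean of relatives:
(weights = base period values)

$$\log M_g = \frac{\sum (p_0 q_0) \log \dfrac{p_i}{p_0}}{\sum p_0 q_0}$$

9. Weighted harmonic mean of relatives:
(weights = base period values)

$$\frac{\sum (p_0 q_0)}{\sum (p_0 q_0) \dfrac{p_0}{p_i}}$$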

Other useful types may be developed by devising different systems
of weights. Suppose we weight the base period prices $p_0$ by base
period quantities $q_0$, and the given period prices $p_i$ by the given
period quantities $q_i$. The ratio of the aggregate value in the given
period to the aggregate value in the base period gives the value index
${}_0V_i$. We thus have

10. Weighted aggregative relative. Value index:
(weights base period = base period quantities)
(weights given period = given period quantities)

$${}_0V_i = \frac{\sum p_i q_i}{\sum p_0 q_0}$$





Other aggregative relatives may be obtained by choosing as weights
averages of the base period quantities $q_0$ and the given period
quantities $q_i$. Thus we may choose as weights

$$\frac{q_0 + q_i}{2}, \qquad \sqrt{q_0 q_i}, \qquad \frac{2 q_0 q_i}{q_0 + q_i}$$

which are respectively the arithmetic mean, the geometric mean, and
the harmonic mean of $q_0$ and $q_i$. Employing these weights, we have
the additional aggregative indexes.

11. Weighted aggregative relative:
(weights = $(q_0 + q_i)/2$)

$$\frac{\sum p_i \dfrac{(q_0 + q_i)}{2}}{\sum p_0 \dfrac{(q_0 + q_i)}{2}} = \frac{\sum p_i (q_0 + q_i)}{\sum p_0 (q_0 + q_i)}$$

12. Weighted aggregative relative:
(weights = $\sqrt{q_0 q_i}$)

$$\frac{\sum p_i \sqrt{q_0 q_i}}{\sum p_0 \sqrt{q_0 q_i}}$$

13. Weighted aggregative relative:
(weights = $2 q_0 q_i/(q_0 + q_i)$)

$$\frac{\sum p_i \dfrac{2 q_0 q_i}{q_0 + q_i}}{\sum p_0 \dfrac{2 q_0 q_i}{q_0 + q_i}} = \frac{\sum p_i \dfrac{q_0 q_i}{q_0 + q_i}}{\sum p_0 \dfrac{q_0 q_i}{q_0 + q_i}}$$


The formulas listed in 10, 11, 12, and 13 above take into account
not only the varying prices but the varying quantities as well. They
have the disadvantage of requiring the quantities $q_i$ at time “i,”
which are not always available. The formula listed in 11, namely,

$${}_0P_i = \frac{\sum p_i (q_0 + q_i)}{\sum p_0 (q_0 + q_i)}$$

is Fisher's formula 2153, which has met wide approval. 1 Owing to its
simplicity and the facility of its computation, Professor Fisher has
proposed its use as a substitute for his “Ideal” index (see page 198).
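
The aggregative formulas 5, 6, 10, and 11 differ only in which quantities multiply the prices, which a short computation makes plain. The following is a minimal sketch in Python, not part of the original text; the function names and the two-commodity data are hypothetical, and each index is expressed with the base period as 100.

```python
def aggregate(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

def formula_5(p0, pi, q0, qi):
    """Weighted aggregative relative, weights = base period quantities."""
    return 100 * aggregate(pi, q0) / aggregate(p0, q0)

def formula_6(p0, pi, q0, qi):
    """Weighted aggregative relative, weights = given period quantities."""
    return 100 * aggregate(pi, qi) / aggregate(p0, qi)

def formula_10(p0, pi, q0, qi):
    """Value index: given period value over base period value."""
    return 100 * aggregate(pi, qi) / aggregate(p0, q0)

def formula_11(p0, pi, q0, qi):
    """Weighted aggregative relative, weights = q0 + qi (the factor 1/2 cancels)."""
    summed = [a + b for a, b in zip(q0, qi)]
    return 100 * aggregate(pi, summed) / aggregate(p0, summed)

# Hypothetical prices and quantities for two commodities (illustration only).
p0, pi = [1.0, 2.0], [1.5, 2.0]
q0, qi = [10, 5], [8, 6]

for f in (formula_5, formula_6, formula_10, formula_11):
    print(f.__name__, round(f(p0, pi, q0, qi), 1))
```

For these data formula 11 falls between formulas 5 and 6, as one would expect from weights that average the two sets of quantities.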

In a similar manner we may construct other index numbers that 
are weighted averages of relatives by devising various systems of 
weights. In Section 52, we recommended the use of values as weights 
for the abstract relatives. Professor Fisher has outlined the following 
methods of weighting by values. 2 

1 Irving Fisher, The Making of Index Numbers, 1927, p. 284. 

2 Irving Fisher, op. cit., p. 54. 





I. Each weight = base period price X base period quantity: $p_0 q_0$

II. Each weight = base period price X given period quantity: $p_0 q_i$

III. Each weight = given period price X base period quantity: $p_i q_0$

IV. Each weight = given period price X given period quantity: $p_i q_i$

We have previously used $p_0 q_0$ as weights for the relatives $p_i/p_0$
in deriving the arithmetic mean of relatives given by 7, the geometric
mean of relatives given by 8, and the harmonic mean of relatives
given by 9. Let us now use the values $p_i q_i$ of the given period as
weights. We have

14. Weighted arithmetic mean of relatives:
(weights = given period values $p_i q_i$)

$$\frac{\sum \dfrac{p_i}{p_0} \times p_i q_i}{\sum p_i q_i}$$

15. Weighted geometric mean of relatives:
(weights = given period values $p_i q_i$)

$$\log M_g = \frac{\sum p_i q_i \log \dfrac{p_i}{p_0}}{\sum p_i q_i}$$

16. Weighted harmonic mean of relatives:
(weights = given period values $p_i q_i$)

$$M_h = \frac{\sum p_i q_i}{\sum p_i q_i \dfrac{p_0}{p_i}} = \frac{\sum p_i q_i}{\sum p_0 q_i}$$

EXERCISES 


1. Compute the value index for grains — formula 10, Section 53 — for 
the year 1925. The data are given in Table 37. 

2. Compute the weighted aggregative relative for the prices of grains 
by formula 11, Section 53. The data are given in Table 37. 

3. Table 40. Production and Farm Price of the
Principal Farm Crops in the United States

Crop            Unit             Production (millions)     Unit Price (dollars)
                                 1913    1921    1929      1913     1921     1929

Corn            bu.              2447    3069    2622      0.69     0.42     0.78
Wheat           bu.               763     815     807      0.80     0.93     1.04
Oats            bu.              1121    1078    1239      0.39     0.30     0.44
Barley          bu.               178     155     307      0.54     0.42     0.55
Rice            bu.                26      38      40      0.86     0.95     0.98
Potatoes        bu.               332     362     357      0.69     0.92     1.31
Apples          bu.               145      99     143      0.98     1.68     1.32
Sweet Potatoes  bu.                59      99      85      0.73     0.88     0.95
Cotton          bale (500 lbs.)    14       8      15     61.00    81.00    82.00
Tobacco         lb.               954    1070    1501      0.13     0.20     0.10
Hay             ton                64      82     102     12.43    12.10    12.23





Using 1913 as the base year, compute, for Table 40 on the preceding 
page, indexes for the years 1921 and 1929: 

(1) by formula 5, Section 53 

(2) by formula 6, Section 53 

(3) by formula 11, Section 53 

(4) by formula 10, Section 53 

(5) by formula 8, Section 53 

(6) by formula 14, Section 53 

54. BIAS 

It is a well-known theorem of algebra that if A is the arithmetic
mean, G the geometric mean, and H the harmonic mean of a set of
positive numbers, not all equal, then H < G < A. 1 That is, the simple
harmonic mean is less than the simple geometric mean which, in turn,
is less than the simple arithmetic mean. In averaging a group of
simple relatives, the arithmetic mean tends to give a value too large
and the harmonic mean a value too small to be a fair representation
of the facts. In more technical language, the arithmetic mean is said
to have an upward bias and the harmonic mean a downward bias. In
contrast with the weight bias, to be considered later, the bias arising
from the form of average used is called type bias.

The existence of bias in the simple arithmetic and the simple 
harmonic means can be explained in another manner, namely, 
through the use of the time reversal test. The time reversal test 
requires that the product of the index for any given period on the 
base period and the index for the base period on the given period 
should equal unity. In symbols, the time reversal test requires that 

$${}_0P_i \cdot {}_iP_0 = 1$$

It is very easy to show that the simple geometric mean of relatives
satisfies this test and that the simple arithmetic and the simple
harmonic means of relatives do not satisfy it. The simple geometric
mean is thus without type bias. It has been observed by the makers
of index numbers that, when the simple arithmetic mean and the
simple harmonic mean are crossed (averaged) geometrically, the bias
is considerably reduced. The fact that the simple geometric mean is
without type bias means that the index number obtained as a simple
geometric mean of relatives is independent of the period taken as
a base. These facts give the geometric mean remarkable merit in
index number construction.

1 For a proof, see Robert W. Burgess, Introduction to the Mathematics of
Statistics, 1927, p. 101.
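
The time reversal test is easily tried numerically. The following is a minimal sketch in Python, not part of the original text; the prices are hypothetical and the function names are ours. For each simple average of relatives it forms the index of the given period on the base period, the index of the base period on the given period, and their product, which the test requires to be unity.

```python
import math

def relatives(base, given):
    """Price relatives of the given period on the base period."""
    return [g / b for b, g in zip(base, given)]

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    return len(xs) / sum(1 / x for x in xs)

def time_reversal_product(mean, p0, pi):
    """Index of i on 0 multiplied by the index of 0 on i."""
    return mean(relatives(p0, pi)) * mean(relatives(pi, p0))

# Hypothetical prices for three commodities (illustration only).
p0 = [1.0, 2.0, 4.0]
pi = [2.0, 2.0, 2.0]

for mean in (arithmetic_mean, geometric_mean, harmonic_mean):
    print(mean.__name__, round(time_reversal_product(mean, p0, pi), 4))
```

For these data the product exceeds unity for the arithmetic mean, falls short of unity for the harmonic mean, and equals unity (to within rounding error) for the geometric mean, in keeping with the upward and downward type bias described above.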

When weights are applied in the construction of index numbers, 
another kind of bias — weight bias — appears. Each system of 
weighting has its bias. Weighting the relatives by base period values
$p_0 q_0$ produces downward bias while weighting the relatives by given
period values $p_i q_i$ produces upward bias. The weighted arithmetic
mean and the weighted harmonic mean of relatives may have both 
type bias and weight bias. If the base period values are employed 
in the construction of the weighted arithmetic mean of relatives, the 
net bias will likely be small. Similarly, if given period values are 
employed as weights in the construction of the weighted harmonic 
mean of relatives, the net bias will likely be small. Further, as the 
net bias of the arithmetic mean of relatives, weighted by base period 
values, has been observed to be in the opposite direction to the net 
bias of the harmonic mean of relatives, weighted by given period 
values, crossing these two indexes geometrically should produce an 
index practically free from bias. 

55. FISHER’S IDEAL INDEX 

Let

A = weighted arithmetic mean of relatives
(weights = base period values $p_0 q_0$)

and

H = weighted harmonic mean of relatives
(weights = given period values $p_i q_i$).

Then the geometric mean of A and H,

$${}_0P_i = \sqrt{AH} = \sqrt{\frac{\sum p_i q_0}{\sum p_0 q_0} \times \frac{\sum p_i q_i}{\sum p_0 q_i}}$$

is known as Fisher’s Ideal Index Number. 1 This index is not only
the geometric mean of a weighted arithmetic mean and a weighted
harmonic mean of relatives; it is clearly a geometric mean of two
aggregative relatives. The formula requires both price and quantity
data for each period to which the index applies. Since the data for
quantities are frequently difficult to secure, the practical usefulness
of the Ideal index is to some extent limited.

1 Irving Fisher, op. cit., p. 220.

Interchanging 0 and i throughout the formula, we have

$${}_iP_0 = \sqrt{\frac{\sum p_0 q_i}{\sum p_i q_i} \times \frac{\sum p_0 q_0}{\sum p_i q_0}}$$

Evidently ${}_0P_i \cdot {}_iP_0$ equals unity, and hence the Ideal
formula satisfies the requirements of the time reversal test.

A second test of validity — a test strongly recommended by Pro- 
fessor Fisher — is the factor reversal test. This test, states Professor 
Fisher, “ought to permit interchanging prices and quantities without 
giving inconsistent results — i.e., the two results multiplied together 
should give the true value ratio.” 1 

If in the Ideal formula we replace every p by a q and every q by a p,
we have

$${}_0Q_i = \sqrt{\frac{\sum q_i p_0}{\sum q_0 p_0} \times \frac{\sum q_i p_i}{\sum q_0 p_i}}$$

Multiplying ${}_0P_i$ and ${}_0Q_i$ together, we have

$${}_0P_i \cdot {}_0Q_i = \frac{\sum p_i q_i}{\sum p_0 q_0}$$

which is the true value index. Consequently, the Ideal formula
meets completely the factor reversal test. This means, of course,
that the formula serves equally well for constructing indexes of
quantities as for constructing indexes of prices, the quantity
index being derived by interchanging p and q in the Ideal formula
for ${}_0P_i$.
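
Both reversal tests can be verified numerically for the Ideal formula. The following is a minimal sketch in Python, not part of the original text; the function names and the two-commodity data are hypothetical. The quantity index is obtained, as described above, by interchanging the roles of p and q in the same function.

```python
import math

def aggregate(a, b):
    """Sum of products of corresponding items."""
    return sum(x * y for x, y in zip(a, b))

def ideal_index(p0, pi, q0, qi):
    """Fisher's Ideal index of p for period i on period 0 (as a ratio, not a percentage)."""
    first = aggregate(pi, q0) / aggregate(p0, q0)   # weights: base period quantities
    second = aggregate(pi, qi) / aggregate(p0, qi)  # weights: given period quantities
    return math.sqrt(first * second)

# Hypothetical prices and quantities for two commodities (illustration only).
p0, pi = [1.0, 2.0], [1.5, 2.0]
q0, qi = [10, 5], [8, 6]

price_index = ideal_index(p0, pi, q0, qi)
reversed_index = ideal_index(pi, p0, qi, q0)    # interchange 0 and i
quantity_index = ideal_index(q0, qi, p0, pi)    # interchange p and q
value_ratio = aggregate(pi, qi) / aggregate(p0, q0)

print(round(price_index * reversed_index, 6))   # time reversal product
print(round(price_index * quantity_index, 6), round(value_ratio, 6))
```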

None of the simple or weighted forms of the elementary indexes
(arithmetic mean, harmonic mean, geometric mean) fulfills the
requirements of the factor reversal test. It is thus obvious that the
strong restrictions imposed by the factor reversal test compel its
being ignored in the construction of many highly reputable index
numbers.

1 Irving Fisher, op. cit., p. 72.


CONCLUSION 

In our treatment of index numbers, we have not attempted to do 
more than touch upon the important phases of the subject. No 
attempt has been made to make the treatment exhaustive. We have 
consciously tried to avoid controversial issues. The student who 
desires a comprehensive treatment of the subject should read the 
following treatises: 

Irving Fisher, The Making of Index Numbers, Houghton, Mifflin Company,
1927.

Wilford I. King, Index Numbers Elucidated, Longmans, Green and Company,
1930.

Wesley C. Mitchell, Index Numbers of Wholesale Prices in the United States
and Foreign Countries, Bulletin Number 284 of the United States Bureau
of Labor Statistics, 1921.

C. M. Walsh, The Problem of Estimation, King and Son, London, 1921.


EXERCISES 


1. If A is the simple arithmetic mean and H is the simple harmonic
mean of a group of relatives, show that their geometric mean,
$\sqrt{AH}$, is an index that fulfills the time reversal test. Evaluate
this index for eliminating type bias.

2. If A is the simple arithmetic mean and H is the simple harmonic
mean of a group of relatives, show that their arithmetic mean,
$\dfrac{A + H}{2}$, and their harmonic mean, $\dfrac{2AH}{A + H}$,
do not satisfy the time reversal test.

3. Using $\sqrt{p_0 p_i}$ as the base prices, the simple arithmetic means
of relatives for the base year and for the given year are respectively
given by

$$A_0 = \frac{1}{N} \sum \frac{p_0}{\sqrt{p_0 p_i}}, \qquad A_i = \frac{1}{N} \sum \frac{p_i}{\sqrt{p_0 p_i}}$$

Show that the index $I = A_i/A_0$ fulfills the time reversal test.

4 . Show that the simple geometric mean of relatives fulfills the time 
reversal test. 

5. Show that the simple arithmetic mean of relatives and the simple 
harmonic mean of relatives do not satisfy the time reversal test. 

6. Using the results of the computations of Tables 36 and 37, find 
Fisher's Ideal index for grain prices in the United States in 1925. 

7. Using the results of Exercises 3 (1) and 3 (2), page 196, find Fisher's
Ideal index for the prices of the principal farm crops in the United States in
1921. Do the same for 1929.


8. Table 41. Production and Wholesale Price of Mineral
Products in the United States for the Years
1919, 1921, and 1923

Product           Unit         Production (millions)        Unit Price (dollars)
                               1919      1921      1923     1919     1921     1923

Pig Iron          long ton       30.6      16.6      40.0   28.97    22.58    26.29
Copper            lb.          1286.0     575.0    1667.0    0.19     0.13     0.15
Anthracite Coal   short ton      88.1      90.5      95.5    8.27    10.53    10.98
Bituminous Coal   short ton     465.9     415.9     564.2    2.34     2.19     2.27
Coke              short ton      44.2      25.3      55.5    4.58     3.45     5.34
Petroleum         bbl.          378.4     469.6     732.4    2.28     1.70     1.44


With 1919 as the base year, compute indexes for the years 1921 and 1923: 

(1) by formula 5, Section 53 

(2) by formula 6, Section 53

(3) by the formula for Fisher’s Ideal using the results of the two pre- 
ceding indexes 

(4) by formula 10, Section 53 

(5) by formula 11, Section 53 

9. (Davies and Crowder.) Compute the weighted aggregative relative
index number of the prices of farm products in Iowa in 1925 on a
1910-1914 base.

Commodity    Weights       Prices                   p_0 q     p_i q
             q             1910-1914    1925
                           p_0          p_i

Hogs          5.17 cwt.    $7.30        $11.08
Cattle        3.85 cwt.     6.39          8.43
Sheep         0.21 cwt.     4.51          7.48
Corn         24.98 bu.      0.53          0.86
Oats         19.12 bu.      0.35          0.39
Wheat         1.03 bu.      0.85          1.44
Hay           0.10 ton      9.82         11.23
Butter       40.62 lb.      0.25          0.41
Eggs         19.56 doz.     0.17          0.27
Poultry      14.58 lb.      0.10          0.18

Total









10. Show that the simple aggregative relative fulfills the time reversal 
test. 

11. Show that the weighted aggregative relative (weights the base 
period quantities) does not fulfill the time reversal test nor the factor 
reversal test. 

12. Show that the weighted aggregative relative (weights the given 
period quantities) does not fulfill the time reversal test. 

13. The following data are taken from the Statistical Abstract of the 
United States, 1936, p. 632. 

Compute the price indexes for these data by using (1) formula 5(a) 
and (2) formula 5(b). 


Grain        Price (cents)      Production               p_0 q_0   p_i q_0   p_0 q_i   p_i q_i
                                (millions of bushels)
             1930     1934      1930      1934
             p_0      p_i       q_0       q_i

Corn         59.6     81.5      2080      1478
Wheat        67.1     84.8       886       526
Oats         32.2     48.0      1275       542
Rye          44.5     71.8        45        17
Barley       40.5     68.6       300       117
Buckwheat    78.8     58.6         7         9
Rice         78.4     79.0        45        39

Total











Chapter 7 

LINEAR TRENDS 

56. INTRODUCTION 

The foregoing chapters have been devoted mainly to the problem of
securing a brief numerical description of the simple frequency
distribution. We have been enabled to describe the characteristic
properties of a distribution (the central tendency, the variability,
the skewness, and the excess) by means of a few statistical
constants. More briefly, we may say that we have been able to compress
the relevant information into four measures:

$$M, \quad \sigma, \quad \alpha_3, \quad \alpha_4 - 3,$$

that are essentially the first four moments of the distribution.
Additional information could be secured by fitting an appropriate
frequency function to the observed data. Inasmuch as the general
problem of describing frequency distributions by means of equations
is beyond the scope of this text, no such refinements will be generally
attempted. 1

Certain types of data, however, admit descriptions by means of 
simple equations, and it is to them that we now turn our attention. 

It should be kept in mind that our problem here is inverse to a
kindred problem in elementary algebra. There we were given the
equation that expressed the relationship between X and Y. We
found sets of values of X and Y, plotted them, and drew the graph
which was a pictorial representation of the relationship. Here, we
have the pairs of values that have come from observation; we plot
them. They seem to lie upon or nearly upon a regular curve; that is,
there is an apparent mathematical relationship between the variables.
What is the equation that expresses exactly or approximately this
relationship? Is there a summarizing constant that can be used to
measure the degree of this relationship?

1 Distributions that may be appropriately represented by the normal curve are 
considered in Chapter 12. 




In this chapter we shall be concerned with data that, we assume, 
obey the simplest mathematical law, the linear or straight-line law. 
Before we proceed to the real problem of the chapter, it is advisable 
that we devote some attention to some of the analytical properties 
of a straight line. 

57. SOME CHARACTERISTIC PROPERTIES 
OF A STRAIGHT LINE 

If two values of a variable, X, are given, we denote their difference
by $\Delta X$ (read: delta ex). This does not mean $\Delta$ multiplied
by X. It is merely a short way of writing, “the difference between the
two values of X.” Thus, if the values of X are 5 and 9, then:

$$\Delta X = 9 - 5 = 4$$

In general, if $X_1$ and $X_2$ are two values of X:

$$\Delta X = X_2 - X_1$$

(Unless otherwise specified, a difference designated by $\Delta$ will be
taken in the order second minus first.) Similarly, $\Delta Y$ means “the
difference between the two values of Y.” Thus, if the values of Y
are -2 and 4:

$$\Delta Y = 4 - (-2) = 6$$

Consider the line AB of Figure 20 with the two points A (2, 3) and
B (5, 7) upon it.

Figure 20

$$\Delta X = 5 - 2 = 3 = AC = MN$$
$$\Delta Y = 7 - 3 = 4 = CB$$

For the more general case, we have the two points $P_1(X_1, Y_1)$
and $P_2(X_2, Y_2)$. Here:

$$\Delta X = X_2 - X_1 = P_1C = MN, \quad \text{and} \quad \Delta Y = Y_2 - Y_1 = CP_2$$

For any two points on a straight line, the ratio $\dfrac{\Delta Y}{\Delta X}$
gives the slope of the line (see Figure 21). It is usually designated
by m. Hence:

$$m = \text{slope of } P_1P_2 = \frac{\Delta Y}{\Delta X} = \frac{Y_2 - Y_1}{X_2 - X_1} \tag{1}$$

Figure 21

Thus the slope of a line between two points is equal to the difference
of the Y-coordinates of the points divided by the difference of their
X-coordinates, subtracted in the same order. It also means the change
in Y due to a unit change in X.
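
Equation (1) may be expressed as a short function. The following is a minimal sketch in Python, not part of the original text; the function name is ours. It is applied to the points A (2, 3) and B (5, 7) of Figure 20, for which the differences are 3 and 4 as found above.

```python
def slope(p1, p2):
    """Slope of the line through p1 and p2: (Y2 - Y1) / (X2 - X1), second minus first."""
    (x1, y1), (x2, y2) = p1, p2
    delta_x = x2 - x1
    delta_y = y2 - y1
    if delta_x == 0:
        # A vertical line has no slope in the sense of equation (1).
        raise ValueError("the two points have the same X-coordinate")
    return delta_y / delta_x

m = slope((2, 3), (5, 7))      # delta Y = 4, delta X = 3
m_reversed = slope((5, 7), (2, 3))
print(m, m_reversed)
```

Taking the points in the opposite order changes the sign of both differences, so the slope is unchanged, provided both subtractions are made in the same order.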

EXERCISES 

Draw the lines determined by the following pairs of points, and find their
slopes:

1. (3, 2) and (5, 7)

2. (-2, -3) and (3, 2)

3. (3, 2) and (5, -7)

4. (-2, 3) and (5, -7)

5. (-2, 3) and (2, 3)

6. (3, -4) and (-2, -4)

7. Construct a line through (0, 0) with the slope equal to 2.

8. Construct a line through (0, 3) with the slope equal to 2.

9. Assuming that Y = 3X + 4 is the equation of a straight line, find
its slope. (Hint: Find two points upon the line.)

10. Assuming that Y = -2X + 4 is the equation of a straight line,
find its slope.

11. Prove by means of slopes that (1, -3), (2, 3), and (3, 9) lie on the
same straight line.



It was probably observed in the exercises on page 205 that the
slope of a line may be positive, negative, or zero. If the line rises as
we proceed from left to right, $\Delta Y$ and $\Delta X$ have the same
sign, and the slope is positive. If the line falls as we proceed from
left to right, $\Delta Y$ and $\Delta X$ have opposite signs, and the
slope is negative. If the line is horizontal as we proceed from left to
right, $Y_2 = Y_1$, and hence the slope equals zero.

Thus in the figure we have three lines through the point R. The slope
of AB is positive; the slope of CD is negative; the slope of EF is zero.

In solving Exercise 11 on page 205, the student probably assumed
that if two segments $P_1P_2$ and $P_2P_3$ have a point $P_2$ in common,
and the same slope, the three points $P_1$, $P_2$, and $P_3$ are in the
same straight line. This theorem and its converse are characteristic
properties of a straight line.

58. THE EQUATION OF A STRAIGHT LINE 

In elementary algebra the student has drawn graphs of certain 
given equations. Our problem now is to find the equation when the 
graph is given. That is, we must express in some algebraic way the 
relation between X and Y of any point on the line. 

For example, if a point is anywhere on the X-axis, the Y-coordinate
is always zero. We express this simply by the equation:

Y = 0

This equation is therefore the equation of the X-axis.

Similarly, the equation of the Y-axis is:

X = 0

What is the equation of a line parallel to the X-axis and two units
above it?

What is the equation of a line parallel to the Y-axis and two units
to the right of it?

Again, if a line bisects the first and third quadrants, evidently

Y = X


Figure 22 








Figure 26 



By definition:

$$\text{the slope} = \frac{Y - 1}{X - 2} = 3$$

or

$$Y - 1 = 3(X - 2)$$

and finally

$$Y = 3X - 5$$

What is the X-intercept of the line? the Y-intercept?

In general, let $P_1(X_1, Y_1)$ be the fixed point, and m the given
slope. As before, let P(X, Y) be any other point on the line. Then,
by definition:

$$\text{the slope} = \frac{Y - Y_1}{X - X_1} = m$$

or

$$Y - Y_1 = m(X - X_1) \tag{3}$$

The equation (3) is called the point-slope equation of the straight
line. Of course (2) is a special case of (3).

The point-slope form is very useful in finding the equation of a line
when two points on the line are given. We can determine the slope by
equation (1), then we may use equation (3) with either of the given
points as the point $P_1(X_1, Y_1)$.

Thus, let us find the equation of the line through the points (2, 1)
and (6, 4).

Here we have:

$$\text{the slope} = m = \frac{4 - 1}{6 - 2} = \frac{3}{4}$$

Now using equation (3), we have either:

a. $Y - 1 = \frac{3}{4}(X - 2)$

b. $Y - 4 = \frac{3}{4}(X - 6)$

In either case, we obtain:

$$3X - 4Y = 2$$

What are the X- and Y-intercepts of this line?
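
The two steps just described, equation (1) for the slope and then equation (3) with either point, may be sketched in code. The following Python example is not part of the original text; the function name is ours, and exact fractions are used on the assumption that the coordinates are whole numbers. It returns m and the constant b in the form Y = mX + b.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (m, b) such that Y = m*X + b passes through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line cannot be written as Y = mX + b")
    m = Fraction(y2 - y1, x2 - x1)   # equation (1)
    b = y1 - m * x1                  # from equation (3): Y - y1 = m(X - x1)
    return m, b

m, b = line_through((2, 1), (6, 4))
print(m, b)

# Either given point may serve as P1 in equation (3); the result is the same.
assert line_through((6, 4), (2, 1)) == (m, b)
```

Multiplying Y = mX + b through by 4 and rearranging gives the form 3X - 4Y = 2 obtained in the text.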





EXERCISES 

1. Construct the line through (0, 2) with m = 3, and find its equation.

2. Construct the line through (0, -2) with m = 3, and find its equation.

3. Construct the line through (0, 2) with m = -3, and find its equation.

4. Construct the line through (0, -2) with m = -3, and find its
equation.

5. Determine the type form of each of the following equations. Name
the two conditions given. Use that knowledge in drawing the graph.


a. 

Y = 3X - 4 

d. 

Y = %X 

b. 

Y - 3 = 2(X - 5)

e. 

Y - 2 = 3(X + 4) 

c. 

Y = 2X 

f. 

Y = X + 5 


6. State the equations of the straight lines: 

a. Through (2, 3) with slope 5 

b. Through (0, 5) with slope f 

c. Through (6, 2) with slope — 1 


Figure 29 


7. Show that AX + BY + C = 0, (B ≠ 0), is the equation of a
straight line.

8. A straight line passes through the points (3, 5) and (8, 12). Find its
equation and its X- and Y-intercepts.

9. Prove that if two nonvertical lines are parallel, their slopes are equal.
State and prove the converse.

10. Prove that if two lines are perpendicular to each other, the product
of their slopes is -1. State and prove the converse.

Let the two lines intersect at C. Lay off $CA_1 = CA_2$, and draw the
parallels to the axes as shown in the figure. Then:

angle $\alpha_1$ = angle $\alpha_2$

(the sides being perpendicular each to each).

Hence the triangles $CA_1B_1$ and $CA_2B_2$ are congruent (why?) and

$$CB_1 = B_2A_2$$
$$CB_2 = -B_1A_1$$

(For $CB_2$ is positive and $B_1A_1$ is negative.)

The slope of line (1) is:

$$m_1 = \frac{B_1A_1}{CB_1}$$

and the slope of line (2) is:

$$m_2 = \frac{B_2A_2}{CB_2}$$

Therefore:

$$m_1 m_2 = \left(\frac{B_1A_1}{CB_1}\right)\left(\frac{B_2A_2}{CB_2}\right) = \left(\frac{B_1A_1}{CB_1}\right)\left(\frac{CB_1}{-B_1A_1}\right) = -1$$


The proof of the converse is left as an exercise for the student. 

11. Show that Y = 2X - 2, Y = 2X, and Y = 2X + 4 are parallel
lines.

12. Show that the lines 3X + 2Y = 6 and -2X + 3Y = 6 are
perpendicular.

13. Find the equation of a line through (2, 5) parallel to 7 = SX + 2. 

14. Find the equation of a line through (2, 3) and perpendicular to 
3Z - 47 = 8. 

16. Are the points (2, 7), (5, 13), (9, 21), (15, 33) on a straight line? 
16. Are the points (1, 5), (3, 10), (5, 13), (7, 16) on a straight line? 


59. FITTING A STRAIGHT LINE TO OBSERVED DATA 


A. The Method of Least Squares. Many observed data, when 
plotted, give a set of points that seem to lie somewhat closely upon 
a curve. (As an illustration, see the data of automobile fatalities 
on page 103.) This suggests to us that the data may approxi- 
mately follow some simple mathe- 
matical law. It is not necessary 
that any of the points lie upon the 
curve selected to describe the data, 
but they will likely be distributed 
above and below the curve as the 
figure indicates. 

Let $P_1$, $P_2$, $P_3$, $P_4$, etc. be several points determined by the
data. The curve indicating their general trend is called an empirical
curve. The difference between the ordinate of a given point and the
ordinate of the corresponding point on an empirical curve is called the
Y-residual of that point. That is:

$\rho_n$ (read: rho enn) = any Y-residual = [ordinate of given point] - [ordinate of corresponding point of curve]

Figure 30



FITTING A STRAIGHT LINE 


211 

Thus the Y-residuals of the points $P_1$, $P_2$, $P_3$, $P_4$ are respectively
$P_1Q_1$, $P_2Q_2$, $P_3Q_3$, $P_4Q_4$.

Let us consider now the following data, which were derived in an 
experiment in a physics laboratory in connection with the problem 
of finding the relation between the resistance in ohms of a certain 
coil of wire and its temperature, the temperature to be kept between 
10° and 100° C. 

Figure 31 



Table 42

    t        R
  10.5     10.42
  29.5     10.94
  42.7     11.32
  60.0     11.80
  75.5     12.24
  91.1     12.67


When these points are plotted with t as the independent variable 
and R the dependent variable they lie close to a straight line. (It 
can be shown by the method of the preceding section that the points 
do not lie upon a straight line.) Allowing for errors of observation, 
we may assume that the law connecting resistance and temperature 
is linear, and our problem now is to determine the equation of the 
straight line which will best fit the given data. 
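One way to confirm the parenthetical remark is to compare the slopes between successive points of Table 42: if the points lay on a single straight line, every slope would be the same. The sketch below assumes that this slope comparison is the check meant by "the method of the preceding section"; the variable names are illustrative only, and exact fractions are used so that rounding does not blur the comparison.

```python
from fractions import Fraction

# (t, R) pairs from Table 42, kept as strings so Fraction parses them exactly.
table_42 = [
    ("10.5", "10.42"),
    ("29.5", "10.94"),
    ("42.7", "11.32"),
    ("60.0", "11.80"),
    ("75.5", "12.24"),
    ("91.1", "12.67"),
]
points = [(Fraction(t), Fraction(r)) for t, r in table_42]

# Slope between each pair of successive points.
slopes = [
    (r2 - r1) / (t2 - t1)
    for (t1, r1), (t2, r2) in zip(points, points[1:])
]

# The points are collinear only if all successive slopes agree.
collinear = len(set(slopes)) == 1

for s in slopes:
    print(float(s))
print("collinear:", collinear)
```

The slopes all fall between roughly 0.027 and 0.029 ohm per degree without being equal, which is the sense in which the points lie close to, but not upon, a straight line.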

What is to be regarded as a best fit will depend upon the precise
way that we choose to define the term best. While there is no unique
answer to the question, we shall define the best straight line in
accordance with what is called the principle of least squares. For the
straight line we shall state as follows the principle of least squares:
The straight line best fitting a set of points is that one in which the
constants are so determined that they will make the sum of the squares
of the residuals a minimum.¹

1 For other methods of fitting a straight line, see Section 81.

Before we undertake to apply this principle to determine the
equation of a straight line best fitting a set of data, let us examine
some observed data to which several straight lines have been fitted
and, adopting the principle of least squares as a criterion for the
goodness of fit, note that we can determine which of several lines
is the best.


Figure 32 



Table 43

   X     Observed Y     Computed Y = 2X + 3
   1          4                  5
   2          8                  7
   3          9                  9
   4         10                 11
   5         14                 13


Consider the observed data given by the first two columns of
Table 43. The five points constituting the observed data are plotted
on Figure 32. On this set of axes is drawn the line Y = 2X + 3.
This line has the slope of 2, the Y-intercept of 3, and it passes through
the point (3, 9), one of the observed points. Two of the observed
points are above the line, two are below the line, and one is on the
line. Judging by the graph, the line is a reasonable fit. That is,
corresponding to the given values of X the computed values of Y,
5, 7, 9, 11, 13 are reasonably near the corresponding observed values
of Y, 4, 8, 9, 10, 14. Stated differently, for the given values of X,
the values of Y computed from Y = 2X + 3 are reasonably close
approximations to the observed values of Y.

Just how near are the observed points, as a group, to the line
Y = 2X + 3? Let us answer this question by applying the principle
of least squares to the residuals (see end of page 211). The results
are given in Table 44.

We note that, based upon the line Y = 2X + 3,

$$\sum\rho = 0 \quad\text{and}\quad \sum\rho^2 = 4.$$

Suppose that we now consider the line Y = 2.2X + 2.4 with the
observed values given in Table 45. If the student will plot the
observed points and the line Y = 2.2X + 2.4 on the same axes, he
will observe that this line also passes among the points and that two
of the observed points are above the line, two are below the line, and
the line Y = 2.2X + 2.4 passes through the observed point (3, 9).


Table 44

   X    Observed Y    Computed Y = 2X + 3    Y-Residuals $\rho$    (Y-Residuals)² $\rho^2$
   1         4                 5                   - 1                    1
   2         8                 7                   + 1                    1
   3         9                 9                     0                    0
   4        10                11                   - 1                    1
   5        14                13                   + 1                    1
                                              0 = $\sum\rho$        4 = $\sum\rho^2$


Thus we may say that the line Y = 2.2X + 2.4 also fits the data
reasonably closely. Just how closely does this line fit the data? Again
we find the sum of the squares of the residuals by preparing Table 45.


Table 45

   X    Observed Y    Computed Y = 2.2X + 2.4    Y-Residuals    (Y-Residuals)²
   1         4                  4.6                 - 0.6            0.36
   2         8                  6.8                 + 1.2            1.44
   3         9                  9.0                   0.0            0.00
   4        10                 11.2                 - 1.2            1.44
   5        14                 13.4                 + 0.6            0.36
                                              0.0 = $\sum\rho$   3.60 = $\sum\rho^2$


If the algebraical sum of the residuals, $\sum\rho$, is adopted as a criterion
for the goodness of fit, of the two lines we have considered one fits
as well as the other since for each line $\sum\rho = 0$. However, if we
adopt $\sum(Y\text{-residuals})^2$, that is $\sum\rho^2$, as the criterion, then the line
Y = 2.2X + 2.4 fits more closely than the line Y = 2X + 3. As a matter of
fact we shall soon have the student show that, based upon the
principle of least squares, the line Y = 2.2X + 2.4 is the best fitting line
to the observed data of Table 44.
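The comparison of the two candidate lines can be reproduced in a few lines of code. The following is only a sketch: the helper name `residual_sums` is hypothetical, and the data are the five observed points of Table 43.

```python
# Observed data from the first two columns of Table 43.
xs = [1, 2, 3, 4, 5]
ys = [4, 8, 9, 10, 14]

def residual_sums(m, b):
    """Return (sum of Y-residuals, sum of squared Y-residuals) for Y = mX + b."""
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    return sum(residuals), sum(r * r for r in residuals)

sum_a, sq_a = residual_sums(2, 3)      # the line Y = 2X + 3
sum_b, sq_b = residual_sums(2.2, 2.4)  # the line Y = 2.2X + 2.4

print("Y = 2X + 3:     ", round(sum_a, 6), round(sq_a, 6))
print("Y = 2.2X + 2.4: ", round(sum_b, 6), round(sq_b, 6))
```

Both lines give a residual sum of zero (up to floating-point rounding), so only the sum of squares separates them: 4 for the first line against 3.60 for the second.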

We shall now proceed to the main problem of the section: adopting
as a criterion of the goodness of fit the principle of least squares, to
find the values of m and b in order that the line Y = mX + b may
best fit a swarm of points. We shall approach the general problem
by considering first a simple set of data, namely, that given in
Exercise 16 on page 210. We have the four points as shown in
Figure 33.



These data evidently have a straight-line trend. Assume that they
can be represented by

$$Y = mX + b$$

for proper values of m and b. Corresponding to X = 1, the ordinate
of the line is m · 1 + b; corresponding to X = 3, the ordinate of the
line is m · 3 + b, and so on. Hence, from the definition:

The 1st Y-residual = $\rho_1 = 5 - (m + b) = 5 - m - b$

The 2nd Y-residual = $\rho_2 = 10 - (3m + b) = 10 - 3m - b$

The 3rd Y-residual = $\rho_3 = 13 - (5m + b) = 13 - 5m - b$

The 4th Y-residual = $\rho_4 = 16 - (7m + b) = 16 - 7m - b$

The values of m and b must be so chosen that the sum of the squares
of the Y-residuals is a minimum. The sum of the squares of the
Y-residuals is given by:

$$\sum\rho^2 = (5 - m - b)^2 + (10 - 3m - b)^2 + (13 - 5m - b)^2 + (16 - 7m - b)^2$$

This result may be written either as a quadratic in b or as a
quadratic in m. We have then:

a. $\sum\rho^2 = 4b^2 + (32m - 88)b + (84m^2 - 424m + 550)$

b. $\sum\rho^2 = 84m^2 + (32b - 424)m + (4b^2 - 88b + 550)$



FITTING A STRAIGHT LINE 


215 


Recalling the theorem of Section 26 (p. 83) to the effect that
$Y = aX^2 + bX + c$ is a minimum when $X = -\frac{b}{2a}$, we note that $\sum\rho^2$
in a. is a minimum when:

$$b = \frac{-(32m - 88)}{8} = -4m + 11$$

or

$$4m + b = 11$$

and $\sum\rho^2$ in b. is a minimum when:

$$m = \frac{-(32b - 424)}{168} = \frac{-4b + 53}{21}$$

or

$$21m + 4b = 53$$

These equations

$$4m + b = 11$$
$$21m + 4b = 53$$

are called normal equations. If they are solved simultaneously, we
obtain

$$m = 1.8, \qquad b = 3.8$$

and hence, by the method of least squares, the best-fitting straight
line is:

$$Y = 1.8X + 3.8$$
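As an informal check that m = 1.8 and b = 3.8 really do minimize the sum of squared residuals for these four points, one can nudge the two constants slightly and confirm that the sum never drops. This is a numerical sanity check under assumed step sizes, not a proof; the function name `sum_sq` is hypothetical.

```python
# The four observed points used in the worked example.
xs = [1, 3, 5, 7]
ys = [5, 10, 13, 16]

def sum_sq(m, b):
    """Sum of squared Y-residuals for the line Y = mX + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

best = sum_sq(1.8, 3.8)

# Perturb m and b by small amounts (step sizes chosen arbitrarily).
steps = [-0.1, -0.01, 0.0, 0.01, 0.1]
never_better = all(
    sum_sq(1.8 + dm, 3.8 + db) >= best
    for dm in steps
    for db in steps
)

print("sum of squares at (1.8, 3.8):", round(best, 6))
print("no perturbation does better:", never_better)
```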


If we give to X in this equation the values 1, 3, 5, 7, we obtain the
corresponding computed or most probable values of Y. Thus:

If X = 1, Y = 1.8 + 3.8 = 5.6
If X = 3, Y = 5.4 + 3.8 = 9.2

Note that if X = 4, Y = 11. That is, the point $(M_X, M_Y)$ is on the
line.

In the following table

any Y-residual = Y observed - Y computed


Table 47. Observed and Computed Values of Y Compared
by Means of Their Y-Residuals

   X             Y Observed      Y Computed    Y-Residuals    (Y-Residuals)²
   1                  5              5.6          - 0.6            .36
   3                 10              9.2            0.8            .64
   5                 13             12.8            0.2            .04
   7                 16             16.4          - 0.4            .16
   $M_X = 4$     $M_Y = 11$                         0.0           1.20



216 


LINEAR TRENDS 


EXERCISE 

Find the equation of the straight line that best fits the data below. Then 
find the computed values of Y and compare them with the observed values. 


Table 48

   X       Y
   1      1.1
   3      6.8
   5     12.6
   7     19.0


Let us now generalize the procedure by fitting the line 

Y = mX + b 

to the data that consist of n sets of values which are given in the table. 1 



We assume that the points have a linear trend, as shown in Fig- 
ure 34. Our problem is to determine the values of m and b for the 
best-fitting line. 

Corresponding to $X = X_1$, the ordinate of the line is $mX_1 + b$;
corresponding to $X = X_2$, the ordinate of the line is $mX_2 + b$, etc.
Hence we have:

$$\rho_1 = \text{1st Y-residual} = Y_1 - (mX_1 + b) = Y_1 - mX_1 - b$$
$$\rho_2 = \text{2nd Y-residual} = Y_2 - (mX_2 + b) = Y_2 - mX_2 - b$$
$$\cdots$$
$$\rho_n = n\text{th Y-residual} = Y_n - (mX_n + b) = Y_n - mX_n - b$$

1 The student should note especially that n is the number of pairs of values of
Y and X.

The values of m and b must be so determined that the sum of the
squares of the Y-residuals is a minimum. We will therefore square
each residual and find the sum. We have

$$\rho_1^2 = Y_1^2 + m^2X_1^2 + b^2 - 2mX_1Y_1 - 2bY_1 + 2bmX_1$$
$$\rho_2^2 = Y_2^2 + m^2X_2^2 + b^2 - 2mX_2Y_2 - 2bY_2 + 2bmX_2$$
$$\cdots$$
$$\rho_n^2 = Y_n^2 + m^2X_n^2 + b^2 - 2mX_nY_n - 2bY_n + 2bmX_n$$

Adding the above equations, we express $\sum\rho^2$ as a quadratic in b
and as a quadratic in m. Using the sigma notation, after careful
rearrangement of terms, we obtain:

a. $\sum\rho^2 = nb^2 + 2[m\sum X - \sum Y]b + [m^2\sum X^2 - 2m\sum XY + \sum Y^2]$

b. $\sum\rho^2 = m^2\sum X^2 + 2[b\sum X - \sum XY]m + [\sum Y^2 - 2b\sum Y + nb^2]$

From equation a., $\sum\rho^2$ is a minimum when

$$b = \frac{-2[m\sum X - \sum Y]}{2n} = \frac{-m\sum X + \sum Y}{n}$$

or

$$m\sum X + nb = \sum Y$$

and from equation b., $\sum\rho^2$ is a minimum when:

$$m = \frac{-2[b\sum X - \sum XY]}{2\sum X^2} = \frac{-b\sum X + \sum XY}{\sum X^2}$$

or

$$m\sum X^2 + b\sum X = \sum XY$$

These equations

$$m\sum X + nb = \sum Y$$
$$m\sum X^2 + b\sum X = \sum XY \qquad (4)$$

are the normal equations.¹ Note that the first can be written by
summing the equation Y = mX + b, and that the second can be
written by multiplying Y = mX + b by X and summing the result.

Solving the normal equations simultaneously, we have:

$$m = \frac{n\sum XY - \sum X\sum Y}{n\sum X^2 - (\sum X)^2}, \qquad b = \frac{\sum X^2\sum Y - \sum X\sum XY}{n\sum X^2 - (\sum X)^2} \qquad (5)$$

and for the best-fitting straight line

$$Y = \left(\frac{n\sum XY - \sum X\sum Y}{n\sum X^2 - (\sum X)^2}\right)X + \frac{\sum X^2\sum Y - \sum X\sum XY}{n\sum X^2 - (\sum X)^2} \qquad (6)$$

The line given by (6) is sometimes called the line of regression² of
Y on X.

Let us use the following tabular arrangement to compute the 
coefficients m and b in (5) and to compare the computed values of 
Y with the observed values. 


Table 50

   X      Observed Y     X²       XY      Computed Y    Y-Residuals    (Y-Residuals)²
  (1)        (2)         (3)      (4)        (5)            (6)             (7)
  ...        ...         ...      ...        ...            ...             ...
  $\sum X$   $\sum Y$   $\sum X^2$   $\sum XY$



1 The student familiar with the calculus would derive these equations much
more quickly. Thus, if

$$\rho_i = Y_i - (mX_i + b) = \text{the } i\text{th Y-residual},$$

then:

$$\sum\rho_i^2 = \sum(Y_i - mX_i - b)^2$$

The values of m and b for which $\sum\rho^2$ is a minimum are obtained by setting the
partial derivatives of $\sum\rho^2$ with respect to m and b each equal to zero. We then
obtain:

$$m\sum X + nb = \sum Y$$
$$m\sum X^2 + b\sum X = \sum XY$$

2 The line of regression of X on Y may be obtained by minimizing the sum of
the squares of the X-residuals of the line X = mY + b. The properties of this
line will be summarized in Section 66 (p. 248).




FITTING A STRAIGHT LINE 219 

In connection with Table 50, the following procedure is recommended
for numerical problems¹:

1. Compute the values for columns (3) and (4).

2. Find $\sum X$, $\sum Y$, $\sum X^2$, and $\sum XY$.

3. Substitute in (4), solve for m and b, and obtain the equation of the
best-fitting straight line.

4. Substitute the values of X from column (1) into the equation of the
straight line and obtain the computed or most probable values of Y.
This completes column (5).

5. Complete columns (6) and (7) and thus find $\sum\rho$ and $\sum\rho^2$.
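The five steps above can be sketched as a short routine. It is a minimal illustration, not a definitive implementation: the function name `fit_line` is hypothetical, and step 3 is carried out with the closed-form solution (5) of the normal equations (4) rather than by elimination.

```python
def fit_line(xs, ys):
    """Least-squares line Y = mX + b, following the columns of Table 50."""
    n = len(xs)
    # Steps 1 and 2: columns (3) and (4) and the four sums.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Step 3: solve the normal equations (closed form, equations (5)).
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are equal; the line is not determined")
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_x * sum_xy) / denom
    # Step 4: column (5), the computed values of Y.
    computed = [m * x + b for x in xs]
    # Step 5: column (6), the Y-residuals (column (7) is their squares).
    residuals = [y - yc for y, yc in zip(ys, computed)]
    return m, b, computed, residuals

# Example: the four points whose fitted line appears in Table 47.
m, b, computed, residuals = fit_line([1, 3, 5, 7], [5, 10, 13, 16])
sum_rho = sum(residuals)
sum_rho_sq = sum(r * r for r in residuals)
print(round(m, 4), round(b, 4), round(sum_rho, 6), round(sum_rho_sq, 4))
```

Guarding against a zero denominator is a design choice of this sketch; it arises only when every X value is the same, in which case no unique line of the form Y = mX + b exists.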


EXERCISE 

Apply the foregoing suggestions to the data of Table 46 and Table 48. 

B. The Method of Moments. Another very widely used method 
for fitting a theoretical curve to observed data is the method of 
moments. 

Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be the n points determined
by n sets of observed data. If the selected curve is denoted by
Y = f(X), the theoretical values of Y are $f(X_1), f(X_2), \ldots, f(X_n)$.
The principle of moments (see Section 42, p. 159) says that we shall
obtain a good fit if the t-th moment about OY (t = 0, 1, 2, ...,
k - 1) of the n observed values of Y equals the corresponding t-th
moment about OY of the n theoretical values of Y, k being the
number of undetermined constants in the given equation. That is,
for t = 0 we have the zeroth moments:

$$\sum \text{observed } Y = \sum \text{theoretical } Y$$

or

$$\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} f(X_i)$$

For t = 1 we have the first moments:

$$\sum_{i=1}^{n} X_iY_i = \sum_{i=1}^{n} X_i f(X_i)$$

For t = 2 we have the second moments:

$$\sum_{i=1}^{n} X_i^2Y_i = \sum_{i=1}^{n} X_i^2 f(X_i)$$

and so on.

1 Equations (5) and (6) are useful for theoretical problems whereas equations
(4) are better for numerical problems.



220 


LINEAR TRENDS 


Figure 35 



When the curve Y = f(X) is a straight line Y = mX + b, we have

$$\sum Y = \sum(mX + b) = m\sum X + nb$$

and

$$\sum XY = \sum X(mX + b) = m\sum X^2 + b\sum X$$

which are the same equations as (4).¹ Evidently the suggestions
following Table 50 apply here.
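To see the moment conditions at work, the sketch below compares the zeroth and first moments of the observed and theoretical values of Y for the two lines examined earlier with the data of Table 43. The helper names `moment` and `moment_gap` are hypothetical.

```python
# Observed data from Table 43.
xs = [1, 2, 3, 4, 5]
ys = [4, 8, 9, 10, 14]

def moment(values, order):
    """Moment of the given order about OY: sum of X**order times the value."""
    return sum((x ** order) * v for x, v in zip(xs, values))

def moment_gap(m, b, order):
    """Observed moment minus theoretical moment for the line Y = mX + b."""
    theoretical = [m * x + b for x in xs]
    return moment(ys, order) - moment(theoretical, order)

for m, b in [(2, 3), (2.2, 2.4)]:
    gaps = [round(moment_gap(m, b, order), 6) for order in (0, 1)]
    print(f"Y = {m}X + {b}: zeroth and first moment gaps = {gaps}")
```

The line Y = 2X + 3 matches the zeroth moment (its residuals sum to zero) but misses the first moment by 2, whereas Y = 2.2X + 2.4 satisfies both conditions, in agreement with its being the least-squares line.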


EXERCISES 

1. Find the equation of the straight line which best fits the temperature
resistance data of Table 42 (p. 211).

2. The lengths, l, attained by a certain coiled spring made of steel wire,
corresponding to different weights, w, supported by the spring were as
shown in the following table. The lengths were measured in centimeters
and the weight in grams. Find the linear relation in the form l = mw + b.

1 It can be shown (see "The Method of Moments" by Dunham Jackson,
American Mathematical Monthly, September, 1923) that if f(X) is a polynomial,
the method of moments gives the same solution as the method of least squares.



FITTING A STRAIGHT LINE 


221 


Lengths and Weights of Spring

    w        l          w        l
   100     92.2        400     98.3
   200     94.3        500    100.2
   300     96.2        600    102.3


3. Compute the length of the spring in the table of Exercise 2 for all 
weights at intervals of 50 grams from 50 grams to 650 grams. 

4. Show that the point $(M_X, M_Y)$ is on the line (6), that is, show that the
best-fitting line passes through the centroid of the points.

5. Using the values of m and b given in equation (5), show that the sum
of the Y-residuals for Y = mX + b is equal to zero.


60. THE STRAIGHT LINE WITH THE ORIGIN 
AT THE CENTROIDAL POINT 


Figure 36 




The theorem contained in Exercise 4 above states that the best-fitting
straight line passes through the centroidal point $(M_X, M_Y)$. Using
this point as origin, the equation of the line takes a much simpler form
and our further mathematical treatment is greatly simplified.

Denote the centroidal point by C. If X, Y is any pair of numbers
referred to zero as origin, their values referred to C as origin are
given by:

$$x = X - M_X, \qquad y = Y - M_Y \qquad (7)$$

The equation of the line referred to the new origin, C, is of the form

$$y = mx$$

since the y-intercept is zero.


The tabulated data now take the following form: 



222 


LINEAR TRENDS 


Table 51

    X        Y        $x = X - M_X$    $y = Y - M_Y$
   $X_1$    $Y_1$         $x_1$            $y_1$
   $X_2$    $Y_2$         $x_2$            $y_2$
   ...      ...           ...              ...
   $X_n$    $Y_n$         $x_n$            $y_n$


Corresponding to $x = x_1$, the ordinate of the line is $mx_1$;
corresponding to $x = x_2$, the ordinate of the line is $mx_2$, etc. Hence:

$$\rho_1 = \text{1st y-residual} = y_1 - mx_1$$
$$\rho_2 = \text{2nd y-residual} = y_2 - mx_2$$
$$\cdots$$
$$\rho_n = n\text{th y-residual} = y_n - mx_n$$

We wish to determine m so that the sum of the squares of the
y-residuals is a minimum. Evidently:

$$\sum\rho_i^2 = \sum(y_i - mx_i)^2 = m^2\sum x_i^2 - 2m\sum x_iy_i + \sum y_i^2 \qquad (8)$$

The value of m for which $\sum\rho_i^2$ is a minimum is, omitting subscripts:

$$m = \frac{\sum xy}{\sum x^2} \qquad (9)$$

and the best-fitting line, referred to the axes through C, is:

$$y = \frac{\sum xy}{\sum x^2}\,x \qquad (10)$$

If we replace x and y by their values in (7), we obtain the equation
of the line referred to axes through O(0, 0) as origin, namely:

$$Y - M_Y = \frac{\sum xy}{\sum x^2}\,(X - M_X) \qquad (11)$$


We shall illustrate the procedure to be followed should one decide
to fit a least-squares line by the x, y method. We shall use data
that we have previously considered by the X, Y method. After
computing $M_X = 4$ and $M_Y = 11$, we complete the table (see
Table 52) by finding the quantities suggested by (9). The algebraic
work follows the table, and we obtain of course the same equation
that we found on page 215.


Table 52

   X             Y             x = X - 4    y = Y - 11     xy     x²
   1             5               - 3          - 6          18      9
   3            10               - 1          - 1           1      1
   5            13                 1            2           2      1
   7            16                 3            5          15      9
   $M_X = 4$    $M_Y = 11$                                 36     20


$$m = \frac{\sum xy}{\sum x^2} = \frac{36}{20} = 1.8$$

y = 1.8x (Equation of line through C as origin)

Y - 11 = 1.8(X - 4)

or

Y = 1.8X + 3.8 (Equation of line referred to axes through O as origin)
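The x, y computation of Table 52 can be sketched as follows. The function name `fit_line_centroid` is hypothetical; the intercept is recovered from equation (11) by setting X = 0.

```python
def fit_line_centroid(xs, ys):
    """Least-squares line found from deviations about the centroid (M_X, M_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations x = X - M_X and y = Y - M_Y, as in equations (7).
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_xx = sum(dx * dx for dx in dev_x)
    m = sum_xy / sum_xx          # equation (9)
    b = mean_y - m * mean_x      # from equation (11) with X = 0
    return m, b, sum_xy, sum_xx

m, b, sum_xy, sum_xx = fit_line_centroid([1, 3, 5, 7], [5, 10, 13, 16])
print(sum_xy, sum_xx, round(m, 4), round(b, 4))
```

For the four points of Table 52 the two sums come out as 36 and 20, giving the same line Y = 1.8X + 3.8 as the X, Y method.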


We have thus developed two methods of finding the equations of
the least-squares line determined by a set of data. We may
determine m and b for the line Y = mX + b by using the normal
equations (4) with the X, Y data, or we may determine m for the
line y = mx, where x and y are the deviations of X and Y from their
respective means: $x = X - M_X$, $y = Y - M_Y$; then replacing x
and y by their values we obtain the X, Y equation.

The second method is preferred for numerical problems when the
values of x and y are such that the arithmetical operations upon
them are simpler than when X and Y are used. Thus, if the X, Y
data are integral and $M_X$ and $M_Y$ are integral, the values of x and y
will be integral and then the table for finding m is decidedly simple
to construct. If $M_X$ and $M_Y$ are decimals and the values of x and y
are decimals, the second method is to be discouraged.

However, for theoretical purposes the results of the second, or
x, y, method are important and the contents of Section 60 should
be mastered.

Let us consider the data of Table 53. We wish to find the X, Y 
equation for these data. 



224 


LINEAR TRENDS 


Table 53. The Index Numbers of Retail Prices of 10 Articles
of Food at Two Different Years

  Article    1st year X    2nd year Y       X²        XY        x       y      x²      xy
     1           88            82          7,744     7,216      4       5      16      20
     2           77            71          5,929     5,467    - 7     - 6      49      42
     3           91            82          8,281     7,462      7       5      49      35
     4           75            70          5,625     5,250    - 9     - 7      81      63
     5           95            87          9,025     8,265     11      10     121     110
     6           83            77          6,889     6,391    - 1       0       1       0
     7           85            77          7,225     6,545      1       0       1       0
     8           82            77          6,724     6,314    - 2       0       4       0
     9           84            73          7,056     6,132      0     - 4       0       0
    10           80            74          6,400     5,920    - 4     - 3      16      12
   Total        840           770         70,898    64,962      0       0     338     282
            $M_X = 84$    $M_Y = 77$


Using the X, Y values with

Y = mX + b

the normal equations are

840m + 10b = 770
70,898m + 840b = 64,962

Eliminating b we obtain

70,560m + 840b = 64,680
70,898m + 840b = 64,962
338m = 282
m = 0.83

Substituting we find

b = 7.28

and our X, Y equation

Y = 0.83X + 7.28

Using the x, y values with

y = mx

the normal equation is

$$m = \frac{\sum xy}{\sum x^2} = \frac{282}{338}$$

m = 0.83

Our x, y equation is

y = 0.83x

and our X, Y equation is

Y - 77 = 0.83(X - 84)

or, simplifying,

Y = 0.83X + 7.28

Obviously the x, y method leads to the solution more simply than
the X, Y method.
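Both routes can be checked numerically from the totals of Table 53. The sketch below uses only those totals; the variable names are illustrative. It also shows the effect of rounding: carried to more places, the slope is about 0.834 and the intercept about 6.92, while the intercept 7.28 in the text results from rounding m to 0.83 before substituting.

```python
# Totals taken from Table 53.
n = 10
sum_x, sum_y = 840, 770
sum_xx, sum_xy = 70898, 64962        # sums of X^2 and XY
sum_dev_xx, sum_dev_xy = 338, 282    # sums of x^2 and xy (deviations from the means)

# X, Y method: closed-form solution of the normal equations (4).
denom = n * sum_xx - sum_x ** 2
m_raw = (n * sum_xy - sum_x * sum_y) / denom
b_raw = (sum_xx * sum_y - sum_x * sum_xy) / denom

# x, y method: slope from the deviations, intercept from the centroid (84, 77).
m_dev = sum_dev_xy / sum_dev_xx
b_dev = sum_y / n - m_dev * (sum_x / n)

# Intercept obtained when m is first rounded to 0.83, as in the text.
b_rounded = 77 - 0.83 * 84

print(round(m_raw, 4), round(b_raw, 4))
print(round(m_dev, 4), round(b_dev, 4))
print(round(b_rounded, 2))
```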

The student who has been impressed with the power of the x′
method when computing M, σ, α₃, and α₄ will naturally wonder if
this method cannot be employed to advantage in this work of fitting
straight lines to data. We assure him that the method is an excellent
one and we shall present it in the next chapter.



LINE WITH ORIGIN AT CENTROIDAL POINT 225 


EXERCISES 

1. Find the X, Y equation of the least-squares line for each of the
following sets of data:

(a)   X:    1     8    14    15    22
      Y:    2     4     8     9    12

(b)   X:    5     7    11    13    14
      Y:   12    15    26    33    34

(c)   X:   10     8     6     4     2
      Y:    4     5     7     8    11

(d)   X:    2     4     6    10    13    15    20
      Y:   47    43    41    37    31    26    20

2. In the following table S is the weight of potassium bromide which
will dissolve in 100 grams of water at T° C. Find the relation: S = mT
+ b. Use this equation to estimate S when T = 50°.

   T:    0    20    40    60    80
   S:   54    65    75    85    96


3. In the following table

X = scores of ten students on a standardized test in secondary algebra
taken at the beginning of college
Y = semester grades of the same students in college algebra

Find by two methods the least-squares line for these data. Based upon
these data, estimate the semester grade of a student who made 60 on the
standardized test.

    X      Y        X      Y
   54     67       90     91
   56     68       63     74
   64     74       47     52
   33     48       92     90
   57     69       34     47


4. A biologist found that the length (in centimeters) of intestines of
birds and their weight (in grams) were linearly related. Find the relation
W = mL + b for the data given in the following table.

   Average Length     Average Weight     Average Length     Average Weight
   of Intestines L      of Bird W        of Intestines L      of Bird W
        4.3                1.5                 9.7                6.5
        5.8                2.7                10.2                7.3
        6.5                3.6                11.0                8.1
        7.3                4.2                11.6                8.8
        8.4                5.4                12.4                9.7
        9.0                5.9                12.6                9.8


5. The latent heat, L, of steam in calories is given for various values of
the temperature, T. Find the equation of the best-fitting line for L in
terms of T.

     T      L         T      L
    70    556       110    530
    80    550       120    523
    90    542       130    515
   100    536

What is the value of L when T = 75?

Compare the computed and the observed values of L for the given values
of T.


61. FITTING A STRAIGHT LINE TO A TIME SERIES 

In Section 17 (p. 43) we encountered series in which time is the
independent variable. Several time series were tabulated in Tables
10, 11, 12 (pp. 44-46), and their graphical representations were
exhibited in Charts 4, 5, and 6. Further attention to time series has
been reserved for this chapter because, after the graphical
representation, the next step in the analysis is the determination of the
long-time trend, frequently called the secular trend, and this is usually
accomplished by fitting a straight line to the data. The straight line, of
course, should be fitted only to those series which, over a long period,
show a general movement in one direction, that is, a general tendency
to increase or to decrease.

Over a considerable period of time, many social and economic
phenomena show a definite tendency to grow or to decline, that is,
they show a definite trend. For example, the population of the
United States (see Table 10) shows a definite tendency to increase,
while the percentage decade rate of growth is constantly declining.
The production of lumber in the United States (see Table 11) from
1909 to 1922 shows a definite tendency to decline. While the secular
trend is usually described by means of a linear function, it must not
be supposed that all definite trends are so simply described.
Population data, for example, frequently require curves with rather complex
equations for their description.

The fact should be emphasized that the secular trend is concerned 
with the regular , long-term movements. True, over a short period of 
time the movements may vary spasmodically, but the general trend 
is upward or downward. We are not concerned here with the sea- 
sonal variations that are so characteristic of time series, but with the 
secular trends, and only those that can be described by linear func- 
tions. 

The computed or trend value of Y at any date is taken as the
normal value at that date. It is viewed as the value that would obtain
if all temporary and accidental forces were eliminated. The equation
of the trend line from which trend values are computed is merely a
summarizing expression for the large group of data upon which it is
based, and therefore may be used for making estimates of values
within the period but may not be at all applicable for making
forecasts and predictions. Some new factor may enter at any time and
disturb the trend. Therefore, when a trend line is used for
extrapolation (that is, for computing values outside the given abscissal
range), the implications of the line beyond the period of record
should be carefully checked against every possible evidence that may
influence the factor in question.

Table 54. The Production of Lumber in the United States:
Computing the Secular Trend¹

   Year      X     Production in Board      X²        XY       Computed Y     $\rho$
                   Feet (billions) Y
   1909     - 6          44.5               36     - 267.0        42.1          2.4
   1910     - 5          40.0               25     - 200.0        41.2        - 1.2
   1911     - 4          37.0               16     - 148.0        40.3        - 3.3
   1912     - 3          39.2                9     - 117.6        39.4        - 0.2
   1913     - 2          38.4                4      - 76.8        38.5        - 0.1
   1914     - 1          37.3                1      - 37.3        37.6        - 0.3
   1915       0          37.0                0         0.0        36.7          0.3
   1916       1          39.9                1        39.9        35.8          4.1
   1917       2          35.8                4        71.6        34.9          0.9
   1918       3          31.9                9        95.7        34.0        - 2.1
   1919       4          34.6               16       138.4        33.1          1.5
   1920       5          33.8               25       169.0        32.2          1.6
   1921       6          27.0               36       162.0        31.3        - 4.3
   1922       7          31.6               49       221.2        30.5          1.1
   Total      7         508.0              231        51.1                      0.4

1 The data are taken from Statistical Abstract of the United States, 1928, p. 689.

The method of fitting a straight line to a time series is illustrated in 
Table 54. Our problem here is to find the equation of the trend line 
for the production of lumber in the United States, the data for which 
were given in Table 11, and graphically presented in Chart 5. 

While the origin for X may be chosen at any point, for the sake
of simple computation it should be taken at or near the center. If
an odd number of years is under consideration, it should be taken at
the middle year of the period. If X = 0 at 1915, we have

$$n = 14, \quad \sum X = 7, \quad \sum Y = 508, \quad \sum X^2 = 231, \quad \sum XY = 51.1$$

Using formulas (5), we find m and b:

$$m = \frac{14(51.1) - 7(508)}{14(231) - 49} = -0.892$$

$$b = \frac{231(508) - 7(51.1)}{14(231) - 49} = 36.73$$

The equation of the straight line which gives the secular trend is
therefore

$$Y = -0.892X + 36.73$$

from which the computed values and the residuals can be found.
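The trend computation for the lumber data can be reproduced with the short sketch below. It assumes the production figures of Table 54 and the origin X = 0 at 1915; the variable names are illustrative only.

```python
# Lumber production (billions of board feet), 1909 through 1922, from Table 54.
years = list(range(1909, 1923))
production = [44.5, 40.0, 37.0, 39.2, 38.4, 37.3, 37.0,
              39.9, 35.8, 31.9, 34.6, 33.8, 27.0, 31.6]

origin_year = 1915                      # X = 0 at 1915
xs = [year - origin_year for year in years]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(production)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, production))

# Formulas (5) for the slope and intercept of the trend line.
denom = n * sum_xx - sum_x ** 2
m = (n * sum_xy - sum_x * sum_y) / denom
b = (sum_xx * sum_y - sum_x * sum_xy) / denom

# Trend values and residuals, year by year.
for year, x, y in zip(years, xs, production):
    trend = m * x + b
    print(year, y, round(trend, 1), round(y - trend, 1))
print("m =", round(m, 3), " b =", round(b, 2))
```

Shifting the origin to a different year would leave the slope unchanged and alter only the intercept, which is why the choice of origin is purely a matter of arithmetical convenience.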

Other methods for treating time series will be found in Sections 81
and 87. While the methods we shall present in these later sections
make possible the determination of the constants m and b with less
arithmetical tedium, we shall present no method that surpasses in
precision and reliability that based upon the principle of least squares.
In addition to the three important properties (Can you name them?)
to which we have referred (casually perhaps), the least-squares
line has the enthusiastic approval of the theory of probability.
We cannot say so much for any other line.

EXERCISES 

1. The following table gives the annual production of Portland Cement
in the United States. [Statistical Abstract of the United States, 1930, p. 785.]
Find the least-squares line for these data.

   Year    Production               Year    Production
           (millions of barrels)            (millions of barrels)
   1910         77                  1920        100
   1911         79                  1921         99
   1912         82                  1922        115
   1913         92                  1923        137
   1914         88                  1924        149
   1915         86                  1925        161
   1916         92                  1926        165
   1917         93                  1927        173
   1918         71                  1928        176
   1919         81                  1929        171


2. In the following table Y gives the average weekly earnings of shop
and office employees in representative New York State factories.

   Year      X        Y          Year      X        Y
   1914     - 3     $12.48       1918      1      $20.35
   1915     - 2      12.85       1919      2       23.50
   1916     - 1      14.43       1920      3       28.15
   1917       0      16.37       1921      4

(1) Find the equation of the least-squares line for these data.

(2) Based upon this line what were the predicted average weekly
earnings in 1921? The actual average weekly earnings were $25.72.

3. In the following table Y gives (in millions of dollars) the net earnings
of the Associated Gas and Electric System, 1920-1928.

   Year      X       Y         Year      X       Y
   1920     - 4     13.4       1925      1      29.5
   1921     - 3     16.2       1926      2      33.5
   1922     - 2     19.2       1927      3      37.8
   1923     - 1     22.7       1928      4      40.6
   1924       0     25.1       1929      5

(1) Find the equation of the least-squares line for these data.

(2) Based upon this line, what were the predicted earnings for 1929?
The actual net earnings were 48.5 millions.

4. The number of mules on farms in the United States for the given
years is shown in the following table. Choose X = 0 at 1932 and find the
least-squares line for the data.

   Year    Mules (millions)      Year    Mules (millions)
   1926         5.9              1933         5.0
   1927         5.8              1934         4.9
   1928         5.7              1935         4.8
   1929         5.5              1936         4.7
   1930         5.4              1937         4.6
   1931         5.3              1938         4.4
   1932         5.1              1939



REVIEW EXERCISES 


1. Use the relations $X = x + M_X$, $Y = y + M_Y$ with the value of
m given by (5) page 218 and thus obtain the value of m given by (9)
page 222.

2. State the three most important properties of the least-squares line
fitting a swarm of points.

3. Given a set of variates, what is the algebraical sum of the deviations
of these variates from their $M_X$?

4. Given a set of variates, from what value is the sum of the squares
of the deviations least?

5. Given a set of variates distributed normally, what per cent of the
variates lie within the interval $M_X \pm \sigma_X$? within the interval $M_X \pm 2\sigma_X$?
within the interval $M_X \pm 3\sigma_X$?

6. When is it advisable to use the method of Section 60 to find the
equation of the least-squares line?

7. What is the unit of measurement of a Y-residual? of a (Y-residual)²?
of $\sum(Y\text{-residual})^2$ or $\sum\rho^2$?

8. Find $\sum\rho^2$ for the data on lumber production, Table 54, including
the unit of measurement.

9. Do you think $\sum\rho^2/n$ can be used to measure the goodness of fit
of a curve fitted to a swarm of points? In what unit would it be expressed?

10. What about $\sqrt{\sum\rho^2/n}$ as a measure of the goodness of fit? In what
unit would it be expressed? Do you think $\sqrt{\sum\rho^2/n}$ superior to
$\sum\rho^2/n$ as a measure of goodness of fit? Why?

11. (a) Show, for the data of lumber production, Table 54, that
$\sqrt{\sum\rho^2/n} = 2.15$ billions of board feet.

(b) How many values of $\rho$ in Table 54 are numerically less than 2.15?

(c) How many values of $\rho$ are numerically less than 2(2.15)?

12. Can you think of any method whereby we may compare the
closeness of fit of straight lines fitted to data expressed in different units, say
Tables 53 and 54?




Chapter 8 

SIMPLE CORRELATION 

62. MEASURES OF CONCENTRATION OF POINTS 
ABOUT THE LINE OF REGRESSION 

In the preceding chapter we have devoted considerable attention 
to the problem of securing a linear mathematical expression for 
the relationship between two variables. This equation, called the 
line of regression of Y on X , expresses mathematically the average re- 
lationship between the variables. 1 In the exercises and the illustra- 
tive examples that we have considered, the points have clustered 
closely about the regression line. But a line of definite equation 
may be fitted to points that are quite scattered, widely dispersed 
with respect to the line. A question immediately presents itself: 
How can we measure the closeness with which the points cluster 
about the line? Can we find a measure of the degree of the relation- 
ship between the two variables? 

This problem is similar to that which arose in connection with the 
measures of central tendency. We desired to know how great was 
the concentration of the measures of a distribution about their mean. 
To measure this concentration we built up several measures of dis- 
persion, recommending especially the standard deviation, which is 
the square root of the mean of the squares of the deviations from the 
arithmetic mean. 

The line of regression possesses two important properties that are 
analogous to similar properties of the arithmetic mean. The arith- 
metic mean is the value such that the sum of deviations from it is 
zero (see Exercise 5, p. 68); the regression line enjoys the property 
that the sum of the residuals from it is zero (see Exercise 5 on p. 221). 
The arithmetic mean is the value such that the sum of the squares 
of the deviations from it is a minimum (see p. 131); the regression 
line enjoys the property that the sum of the squares of the residuals 
from it is a minimum (the principle of least squares). 
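
The two properties of the regression line can be checked numerically. The following is a minimal sketch in Python; the five data pairs and the helper names `least_squares_line` and `sum_sq` are illustrative assumptions, not taken from the text.

```python
from statistics import fmean

def least_squares_line(xs, ys):
    """Return slope m and intercept b of the least-squares line of Y on X."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    m = sxy / sxx
    return m, my - m * mx

# Hypothetical illustrative data (not from the text).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

m, b = least_squares_line(xs, ys)

# Property 1: the residuals from the fitted line sum to zero.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
sum_of_residuals = sum(residuals)

def sum_sq(slope, intercept):
    """Sum of squared Y-residuals about the line Y = slope*X + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Property 2: any other line gives a larger sum of squared residuals.
best = sum_sq(m, b)
perturbed = [sum_sq(m + dm, b + db) for dm in (-0.1, 0.1) for db in (-0.1, 0.1)]
```

Here `sum_of_residuals` is zero up to rounding error, and every entry of `perturbed` exceeds `best`.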

1 The line of regression of X on Y will be considered in Section 66 (p. 247). 



Owing to the fact that the line of regression possesses the two
properties mentioned before, it is frequently called the line of means.
This name will be more adequately justified in Section 67 (p. 254).
Consequently, just as we used

$$\sigma_X = \sqrt{\frac{\sum f x^2}{N}}$$

(where N is the total frequency) to measure the concentration of the
observed X measures about their mean, so we use

$$S_y = \sqrt{\frac{\sum \rho^2}{n}} \qquad (1)$$

(where n is the number of pairs of values of X and Y and where
$\rho_i = \text{observed } Y_i - \text{computed } Y_i$) to measure the concentration
of the observed Y measures about their line of means. $S_y$ is called
the standard error of estimate. One method of obtaining $S_y$ is illustrated
in Table 47 (p. 215) and Table 50 (p. 218).

In Table 47 we have:

$\sum\rho^2 = 1.20$ and $n = 4$

therefore

$$S_y = \sqrt{\frac{1.20}{4}} = 0.54772$$
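
Formula (1) can be sketched in a few lines of Python. The function names below are illustrative assumptions; the only figures taken from the text are the sum of squared residuals, 1.20, and n = 4 from Table 47.

```python
from math import sqrt

def standard_error_of_estimate(residuals):
    """S_y = sqrt(sum(rho^2) / n), computed from the individual Y-residuals."""
    n = len(residuals)
    return sqrt(sum(r * r for r in residuals) / n)

def standard_error_from_sum(sum_sq_residuals, n):
    """Same quantity when only the sum of squared residuals is known."""
    return sqrt(sum_sq_residuals / n)

# Table 47 of the text: sum of squared residuals 1.20, n = 4.
s_y = standard_error_from_sum(1.20, 4)
print(round(s_y, 5))
```

Note that the divisor is n, as in the text, not n - 2 as in some later treatments.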

EXERCISE 

Find S v for Exercises 1 and 2 on page 229. 

It is evident that the Y-residuals and $S_y$ are expressed in the given
Y-unit. To interpret $S_y$ intelligently requires a knowledge of the
properties of a normal surface. It is sufficient at this point to state
that for a distribution of sufficient size to approximate the normal
form, about two-thirds of the points will lie in a strip bounded by two
lines on either side of and parallel to the regression line, RR, and a
vertical distance, $S_y$, from it. That is, the odds are 2 to 1 that, for
a given X, the observed Y will lie within the zone:

(computed Y) $\pm S_y$

[Figure 38]


Similarly, a zone established by drawing lines on either side of
and parallel to RR and a vertical distance $2S_y$ from it will include
about 95 per cent of the points. That is, the odds are 95 to 5 or
19 to 1 that, for a given X, the observed Y will lie within the zone:

(computed Y) $\pm 2S_y$

If the zone is further enlarged, say $3S_y$ vertically from RR above
and below, it is practically certain (odds 385 to 1) that an observed
Y will lie within the interval

(computed Y) $\pm 3S_y$
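
The proportions quoted for the three zones can be checked against the normal law. The sketch below assumes the residuals are normally distributed and uses only Python's standard `math.erf`; the function name `prob_within` is an illustrative assumption. The exact values differ slightly from the round figures in the text (the exact odds for the one-, two-, and three-$S_y$ zones are roughly 2.15 to 1, 21 to 1, and 370 to 1).

```python
from math import erf, sqrt

def prob_within(k):
    """P(|Z| <= k) for a standard normal variate Z."""
    return erf(k / sqrt(2))

# Probability and odds that an observed Y lies within k * S_y of the computed Y.
zones = {}
for k in (1, 2, 3):
    p = prob_within(k)
    zones[k] = (p, p / (1 - p))   # (probability, odds in favour)

for k, (p, odds) in zones.items():
    print(f"within {k} S_y: probability {p:.4f}, odds about {odds:.1f} to 1")
```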

Let us illustrate these statements graphically. On Figure 39 we
have plotted thirty points which represent graphically thirty (X, Y)
sets of observed data. To these data we have fitted the regression
line RR. It will be noted that twenty of the points lie within the
zone determined by the parallels to RR and $\pm S_y$ from it. Twenty-eight
are within the area determined by the parallels to RR and
$\pm 2S_y$ from it. Only two of the points are outside the latter area.


Figure 39 






Let us look at this matter somewhat differently. It is recalled
that the line of regression may be used for purposes of estimating Y
for given values of X. (When X is within the given abscissal range, we
estimate Y from the equation; when X is outside the given abscissal
range, we predict Y from the equation.) Thus, the computed value of
Y may be an estimate or a prediction. In the language of probability,
given an X, the equation gives the best or most probable value of Y.

When we use the regression equation to make estimates or predictions,
we naturally are eager to know the degree of confidence to
put in our results. Suppose we choose an X and compute Y. The
odds are 2 to 1 that the observed Y will not differ numerically from
the computed Y by more than $S_y$. Thus, for Table 54, we have

$$Y = -0.892X + 36.73$$

billions of board feet, and $S_y = 2.15$ billions of board feet. Let
X = 5. We find Y = 32.3 billions of board feet. The odds are
2 to 1 that this value does not differ from the observed Y (= 33.8)
by more than $S_y$ (= 2.15). That is, the odds are 2 to 1 that the
observed Y is within the interval 32.3 ± 2.15 billions of board feet.

Of course if the student wishes to do so, he may use the probable
error of estimate instead of the standard error as a measure of the
reliability of his estimate. Since the probable error of any parameter
is 0.6745 times the standard error of the parameter, we have

(Probable error of estimate) = 0.6745 (Standard error of estimate) = $0.6745\,S_y$

There is obviously a consequent change of language. In this case
the chances are even that for a given X the observed Y will not differ
from the estimated Y by more than $\pm 0.6745\,S_y$.
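
As a rough sketch, the lumber example can be worked in Python with both the standard-error band and the probable-error band. The slope, intercept, $S_y$, and observed value are those quoted in the text for Table 54; the function name `estimate_with_band` is an illustrative assumption.

```python
def estimate_with_band(x, m, b, s_y, factor=1.0):
    """Return the computed Y and the band (computed Y) +/- factor * S_y."""
    y_hat = m * x + b
    half_width = factor * s_y
    return y_hat, y_hat - half_width, y_hat + half_width

# Values quoted in the text for Table 54 (billions of board feet).
M, B, S_Y = -0.892, 36.73, 2.15
OBSERVED = 33.8

# Standard-error band: odds about 2 to 1 that the observed Y falls inside.
y_hat, lo, hi = estimate_with_band(5, M, B, S_Y)

# Probable-error band: even chances that the observed Y falls inside.
_, pe_lo, pe_hi = estimate_with_band(5, M, B, S_Y, factor=0.6745)

print(round(y_hat, 2), (round(lo, 2), round(hi, 2)))
print((round(pe_lo, 2), round(pe_hi, 2)))
```

For X = 5 the observed value 33.8 lies inside the standard-error band but just outside the narrower probable-error band.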


EXERCISES 

1. Show that $\sum\rho^2 = \sum[Y - (mX + b)]^2 = \sum Y^2 - b\sum Y - m\sum XY$.

Hint: Make use of the normal equations (4), page 217.

2. (a) Using Exercise 1 above, show that

$$S_y = \sqrt{\frac{\sum Y^2 - b\sum Y - m\sum XY}{n}} \qquad (1')$$

What sigma ($\Sigma$) function, not used in finding m and b, is needed to find
$S_y$ from formula (1')?

(b) Prove: $S_y = \sigma_\rho$.




3. The following data are taken from The World Almanac, 1935, pp. 292
and 310.

X = Savings Bank Deposits in the U.S., 1918-1933.

Y = Number of Strikes and Lockouts in the U.S., 1918-1933.

          Sav. Bk. Dep.        S. and L.O.
Year      (billions of $'s)    (thousands)
          X                    Y

1918       5.4                 3.3
1919       5.9                 3.6
1920       6.5                 3.3
1921       6.8                 2.4
1922       7.1                 1.1
1923       7.7                 1.6
1924       8.2                 1.2
1925       8.9                 1.3
1926       9.3                 1.0
1927       9.5                 0.7
1928      10.0                 0.6
1929      10.1                 0.9
1930      10.4                 0.7
1931      11.0                 0.9
1932      10.9                 0.8
1933      10.4                 1.6

(1) Find the equation of the regression line.

(2) Interpret the value of m.

(3) Find $S_y$ using formula (1'), and interpret it.

(4) If X = 8 find Y. Using $S_y$ interpret your result.

(5) In 1935, X = 10.6. Compute Y and compare with the
actual Y = 2.0.


4. In the table below

X = value of crops (dollars per acre)

Y = value of land and buildings (dollars per acre) in
fifteen counties of Illinois in 1930.

   X       Y

 12.5     74
 19.8    170
 17.3    147
  9.9     57
 10.9     75
  7.5     46
 13.7    130
 13.1     89
  8.5     59
  3.8     20
 11.9     90
  8.6     74
 12.1     41
 11.9     77
 15.6    144

(1) Find the equation of the regression line.

(2) Interpret the value of m.

(3) Compute $S_y$ by formula (1').

(4) Compute Y for X = 10, and interpret your result
with the aid of $S_y$.





5. In the following table

X = average yield (bushels per acre) of corn, 1910-1919.

Y = average land value (dollars per acre) on January 1, 1920 in
twenty-five counties of Iowa.

  X      Y         X      Y

 40     87        41    193
 36    133        38    203
 34    174        38    279
 41    285        34    179
 39    263        45    244
 42    274        34    165
 40    235        40    257
 31    104        41    252
 36    141        42    280
 34    208        35    167
 30    115        33    168
 40    271        36    115
 37    163

(1) Find the equation of the regression line.

(2) Interpret the value of m.

(3) Compute $S_y$ by formula (1').

(4) Compute Y when X = 40, and interpret
your result.


63. THE BRAVAIS-PEARSON COEFFICIENT
OF CORRELATION

By far the major objection to $S_y$ as a measure of the goodness
of fit of a regression line to a swarm of points is this: it is a concrete
number expressed in the given Y-unit. This fact renders it useless
for purposes of comparison. What we really need is an index for
measuring the closeness of fit that is independent of the unit of measure,
a pure number, a relative which will measure the degree rather than
the amount of the closeness with which the regression line estimates
the observed values. We proceed to find such a measure.

To accomplish this end, it is very enlightening to express $S_y$ in
terms of the observed values. For the sake of simplicity, we shall
assume that the observed data are referred to axes through
$(M_X, M_Y)$. From equation (8) on page 222 we have:

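$$\sum\rho^2 = m^2\sum x^2 - 2m\sum xy + \sum y^2$$

where, in terms of the observed values,

$$m = \frac{\sum xy}{\sum x^2}$$

If this value of m is substituted in the expression for $\sum\rho^2$, we obtain:

$$\sum\rho^2 = \left(\frac{\sum xy}{\sum x^2}\right)^2\sum x^2 - 2\,\frac{\sum xy}{\sum x^2}\sum xy + \sum y^2$$

$$= \sum y^2 - \frac{(\sum xy)^2}{\sum x^2}$$

$$= \sum y^2\left[1 - \frac{(\sum xy)^2}{\sum x^2\,\sum y^2}\right]$$

and

$$S_y^2 = \frac{\sum\rho^2}{n} = \frac{\sum y^2}{n}\left[1 - \frac{(\sum xy)^2}{\sum x^2\,\sum y^2}\right]$$

Recalling that

$$\frac{\sum x^2}{n} = \sigma_X^2 \quad\text{and}\quad \frac{\sum y^2}{n} = \sigma_Y^2$$

we have:

$$S_y^2 = \sigma_Y^2\left[1 - \frac{(\sum xy)^2}{n^2\sigma_X^2\sigma_Y^2}\right] = \sigma_Y^2\left[1 - \left(\frac{\sum xy}{n\sigma_X\sigma_Y}\right)^2\right]$$

and finally

$$S_y^2 = \sigma_Y^2\,(1 - r_{XY}^2)$$

or

$$S_y = \sigma_Y\sqrt{1 - r_{XY}^2} \qquad (2)$$

where

$$r_{XY} = \frac{\sum xy}{n\sigma_X\sigma_Y} \qquad (3)$$

is the well-known Bravais-Pearson coefficient of correlation. 1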

Since x and y are the deviations of the observed values from their 
respective means, it is evident that r can be very simply computed 
from the observed values. 
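
The relation (2) between $S_y$, $\sigma_Y$, and r can be verified numerically. The sketch below uses a small set of hypothetical pairs (not from the text) and an illustrative function name; it computes $S_y$ once directly from the residuals and once from $\sigma_Y\sqrt{1 - r^2}$.

```python
from math import sqrt
from statistics import fmean

def correlation_and_errors(xs, ys):
    """Return r, m, sigma_x, sigma_y and S_y computed in two ways."""
    n = len(xs)
    mx, my = fmean(xs), fmean(ys)
    x = [v - mx for v in xs]              # deviations from M_X
    y = [v - my for v in ys]              # deviations from M_Y
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    sigma_x, sigma_y = sqrt(sxx / n), sqrt(syy / n)
    r = sxy / (n * sigma_x * sigma_y)     # formula (3)
    m = sxy / sxx                         # slope of the regression of Y on X
    # S_y directly from the residuals y - m*x about the line through the means.
    s_y_direct = sqrt(sum((b - m * a) ** 2 for a, b in zip(x, y)) / n)
    # S_y from formula (2); max() guards against tiny negative rounding error.
    s_y_formula = sigma_y * sqrt(max(0.0, 1 - r * r))
    return r, m, sigma_x, sigma_y, s_y_direct, s_y_formula

# Hypothetical illustrative data (not from the text).
r, m, sigma_x, sigma_y, s_y_direct, s_y_formula = correlation_and_errors(
    [2, 4, 6, 8, 10], [3, 7, 5, 11, 9]
)
```

For these pairs r = 0.8, and the two values of $S_y$ agree to rounding error.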

The coefficient r plays such an important part in statistical analysis 
that it is advisable for us to show its relation to the slope, m, of the 
regression line. Thus: 

$$m = \frac{\sum xy}{\sum x^2} = \frac{n r \sigma_X \sigma_Y}{n\sigma_X^2}$$

or

$$m = r\,\frac{\sigma_Y}{\sigma_X} \qquad (4)$$


1 As is our custom we shall omit the subscript XY employing it only for 
purposes of identification. 





The equations of the regression line of Y on X, (10) and (11) of
Chapter 7 (p. 222), now become:

$$y = r\,\frac{\sigma_Y}{\sigma_X}\,x \qquad (5)$$

and

$$Y - M_Y = r\,\frac{\sigma_Y}{\sigma_X}\,(X - M_X) \qquad (6)$$

If $S_y$ is taken to be the measure of goodness of fit of the line of
regression of Y on X to the observed points, or a measure of the
closeness of the relationship of X and Y, it will soon become evident
that r is probably a superior measure for this relationship. From
equation (2) it is evident that $S_y^2$ and $\sigma_Y^2$ cannot be negative,
so that $1 - r^2$ cannot be negative, and therefore
r must lie in the interval -1 to +1. That is:

$$-1 \le r \le +1$$

As r approaches unity numerically, $S_y$ decreases toward zero,
and this occurs when the points in general cluster closely about the
line. As r approaches zero, $S_y$ increases toward its maximum value,
$\sigma_Y$, and this occurs when the points in general are widely dispersed
about the line. If r equals unity numerically, $S_y^2$ equals zero, hence
each residual must equal zero, and the observed points lie upon the
line. When r equals unity numerically, we have what is known
as perfect correlation between the variables X and Y, for the lines
of regression then describe the data perfectly.

Therefore a high coefficient of correlation means a small $S_y$, and
consequently a close relationship between Y and X, whereas a low
coefficient of correlation means a large $S_y$, and consequently a poor
relationship between X and Y.

Thus we have found our index, for we see by (3) that r is a pure
number (that is, it is independent of any units of measurement),
and hence may be taken as a measure of the degree of the relationship
between X and Y. It may therefore be used to measure the relationship
between variates expressed in any units, as, bushels and dollars,
inches and pounds, marks in English and marks in mathematics on
different scales, and so on.
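
A short numerical illustration of this independence of units, offered as a sketch only: the yields and prices below are hypothetical, and `pearson_r` is an illustrative helper implementing formula (3). Re-expressing bushels as pecks and dollars as cents leaves r unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = sum(xy) / (n * sigma_x * sigma_y), x and y being deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    # n * sigma_x * sigma_y equals sqrt(sxx * syy).
    return sxy / sqrt(sxx * syy)

# Hypothetical data: yield in bushels and price in dollars.
bushels = [30.0, 34.0, 36.0, 40.0, 41.0]
dollars = [1.10, 1.65, 1.40, 2.30, 2.10]

# The same observations in other units: pecks (4 per bushel) and cents.
pecks = [4 * b for b in bushels]
cents = [100 * d for d in dollars]

r_original = pearson_r(bushels, dollars)
r_rescaled = pearson_r(pecks, cents)
```

By contrast, $S_y$ computed from the same data would be a hundred times larger in cents than in dollars.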

In Chapter 7 we learned that if the slope is positive, Y increases as
X increases; if the slope is negative, Y decreases as X increases.
From equation (4), since $\sigma_X$ and $\sigma_Y$ are always positive, m is positive
if r is positive and m is negative if r is negative. Therefore, it follows
that if r is positive, Y increases with X, and if r is negative,
Y decreases as X increases. The converse of this statement is also
true.

The remarks that we have just made about correlation have been 
from a mathematical standpoint. As we proceed in our study, 
however, these abstract notions will be clothed with real meaning. 
We are aware that certain characters tend to rise and fall together 
as if connected by some direct causal relation — for examples, tall 
men in general weigh more than short men, young husbands in 
general are married to young wives, a falling barometer usually 
signifies an approaching storm, an abnormally small crop in general 
results in a higher price for the product. In other words, we are 
aware of the existence of certain persistent relationships between 
pairs of variables. 

The existence of this persistent relationship between paired vari- 
ables is the important feature of correlation. The variables may 
in general fluctuate directly or inversely, that is, high values of 
one variable will in general be paired with high values of the other 
variable, or high values of one variable will in general be paired 
with low values of the other — in either case they are said to be 
correlated. 

Therefore: 

Correlation may be defined as tendency toward concomitant variation, 
and a so-called coefficient is simply a measure of such tendency, more or 
less adequate according to the circumstances of the case. 1 

In the few preceding pages we have suggested three expressions for 
this relationship, namely, (1) the equation of the line of regression, 
(2) the value of the standard error of estimate, and (3) the coeffi- 
cient of correlation. Each expression has its use, and we shall neglect 
none of them, but by far the greatest emphasis will be given the 
coefficient of correlation. 2 

1 William Brown and G. H. Thomson, Essentials of Mental Measurement, 
3d ed., 1921, p. 97. 

2 If it is desired, Section 66 (p. 247) may now be read to advantage.





64. COMPUTATION OF r FOR UNGROUPED DATA

Since r plays such an important rôle in the study of relationships,
we shall devote considerable attention to its computation and to its
interpretation.

The following should be the tabular arrangement for computing r
for ungrouped data when the computation is based upon formula (3).

X     Y     x     y     xy     x²     y²

The following steps should be followed in the arithmetical summary:

1. Find $\sum X$, then $M_X$.

2. Find $\sum Y$, then $M_Y$.

3. Find $\sum x^2$, $\sum xy$, and $\sum y^2$.

4. Find $\sigma_X$, $\sigma_Y$, and r.

5. Find m from equation (4), or from $m = \sum xy / \sum x^2$.

6. Write the regression equation of Y on X using equation (6).

7. Obtain the computed values of Y if they are desired.

8. Find $S_y$ from equation (2).


The table on the following page will illustrate the steps recom- 
mended in the preceding summary. 


We have:

n = 15          $\sum X = 1402.8$        $\sum Y = 876.4$
$\sum xy = -1447.72$    $\sum x^2 = 2509.21$    $\sum y^2 = 1852.48$

$M_X$ = 93.5 bu.    $M_Y$ = 58.4¢    $\sigma_X$ = 12.93 bu.    $\sigma_Y$ = 11.11¢
r = -0.672      $S_y$ = 8.23¢

$$m = \frac{-0.672\,(11.11)}{12.93} = -0.58$$

For the line of regression of Y on X we have:

$$Y - 58.4 = -0.58\,(X - 93.5)$$

or

$$Y = -0.58X + 112.63$$
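
The eight steps of the summary can be sketched in Python on the potato data of Table 55. The variable names are illustrative assumptions. The means are carried unrounded here, so the results agree with the figures above only to about the last printed digit (the intercept, in particular, comes out a little below 112.63, which was obtained from the rounded slope -0.58).

```python
from math import sqrt

# Table 55: yield (bushels per acre) and farm price (cents per bushel), 1900-1914.
X = [80.8, 65.5, 96.0, 84.7, 110.4, 87.0, 102.2, 95.4,
     85.7, 106.1, 93.8, 80.9, 113.4, 90.4, 110.5]
Y = [43.1, 76.7, 47.1, 61.4, 45.3, 61.7, 51.1, 61.8,
     70.6, 54.1, 55.7, 79.9, 50.5, 68.7, 48.7]
n = len(X)

# Steps 1 and 2: the sums and the means.
M_x = sum(X) / n
M_y = sum(Y) / n

# Step 3: sums of squares and products of the deviations.
x = [v - M_x for v in X]
y = [v - M_y for v in Y]
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Step 4: standard deviations and r, formula (3).
sigma_x = sqrt(sum_x2 / n)
sigma_y = sqrt(sum_y2 / n)
r = sum_xy / (n * sigma_x * sigma_y)

# Step 5: slope from equation (4).
m = r * sigma_y / sigma_x

# Step 6: regression of Y on X, equation (6), written as Y = m*X + b.
b = M_y - m * M_x

# Step 7: computed values of Y.
computed_Y = [m * v + b for v in X]

# Step 8: standard error of estimate, equation (2).
S_y = sigma_y * sqrt(1 - r * r)

print(f"r = {r:.3f}, m = {m:.2f}, b = {b:.2f}, S_y = {S_y:.2f}")
```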





Table 55. The Average Yield per Acre and the Average Farm Price
per Bushel for Potatoes in the United States, 1900-1914 1

         Yield       Price
Year     (bushels)   (cents)       x        y          xy          x²          y²
         X           Y

1900      80.8       43.1       -12.7    -15.3      194.31      161.29      234.09
1901      65.5       76.7       -28.0     18.3     -512.40      784.00      334.89
1902      96.0       47.1         2.5    -11.3      -28.25        6.25      127.69
1903      84.7       61.4        -8.8      3.0      -26.40       77.44        9.00
1904     110.4       45.3        16.9    -13.1     -221.39      285.61      171.61
1905      87.0       61.7        -6.5      3.3      -21.45       42.25       10.89
1906     102.2       51.1         8.7     -7.3      -63.51       75.69       53.29
1907      95.4       61.8         1.9      3.4        6.46        3.61       11.56
1908      85.7       70.6        -7.8     12.2      -95.16       60.84      148.84
1909     106.1       54.1        12.6     -4.3      -54.18      158.76       18.49
1910      93.8       55.7          .3     -2.7        -.81         .09        7.29
1911      80.9       79.9       -12.6     21.5     -270.90      158.76      462.25
1912     113.4       50.5        19.9     -7.9     -157.21      396.01       62.41
1913      90.4       68.7        -3.1     10.3      -31.93        9.61      106.09
1914     110.5       48.7        17.0     -9.7     -164.90      289.00       94.09

Total  1,402.8      876.4          .3       .4   -1,447.72    2,509.21    1,852.48
Mean      93.5+      58.4+


We have here a fairly significant coefficient of correlation,
r = -0.672. Its large numerical value warrants our belief that there
does exist a significant relationship between the average yield of
potatoes and the corresponding price per bushel. The negative sign,
as previously stated, means that as X increases Y decreases. In
accordance with our definition of slope, the value of -0.58 for m
means that on the average, an increase of one bushel per acre in the
yield will mean a diminished price of more than half a cent per
bushel.

Now, let us use our equation for estimating the price that corresponds
to a given yield, and $S_y$ for measuring the reliability of the
estimate. Let X = 100 bu. per acre, then Y estimated is -0.58(100)
+ 112.63 = 54.6 cents. Since $S_y$ = 8.23 cents, the odds are 2 to 1
that the observed Y for X = 100 does not differ from 54.6 cents by
more than 8.23 cents.


1 The data are taken from Yearbook of Agriculture , 1920, p. 616. 





EXERCISES 

1. The average daily grades and the final examination grades for ten
students in a class in calculus are given in the table below.

X = the average daily grade
Y = the grade on the final examination

Find r, the line of regression of Y on X, and $S_y$. If X = 90, Y = ( ).

Student     X      Y        Student     X      Y

   1       86     71           6       96     94
   2       93     76           7       80     71
   3       73     61           8       70     60
   4       66     52           9       95     85
   5       88     75          10       63     55

2. The following table 1 gives the results of experiments performed at
Delhi, California, to determine the effect of irrigation upon the yield in
alfalfa.

Find r if

X = the total seasonal depth (in inches) of water applied and
Y = the average yield (tons per acre) for the years 1922, 1923, 1924

  X       Y          X       Y

 12     5.27        36     8.20
 18     5.68        42     8.71
 24     6.25        48     8.42
 30     7.21        60     8.24

3. In the following table 2

X = the July rainfall (in inches) for the given year for Ohio, and
Y = yield of corn (bushels per acre)

Find r, $S_y$, and the regression equation. If X = 4, Y = ( ). Interpret.

Year      X      Y        Year      X      Y

1900     4.6    42.6      1905     3.9    37.9
1901     2.7    30.0      1906     5.1    42.2
1902     4.7    38.8      1907     5.4    34.8
1903     3.7    31.5      1908     4.1    36.1
1904     4.1    32.8      1909     3.8    38.7

1 The data are from University of California Experiment Station, Bulletin
No. 450, p. 8.

2 The data are taken from Monthly Weather Review, Vol. 42 (1914), p. 80.


65. OTHER FORMS OF r 


The correlation coefficient, r, as we have defined it by equation (3)
of Section 63 is expressed in terms of the deviations of the variates
from their respective means, $M_X$ and $M_Y$. Since $M_X$ and $M_Y$ usually
require several decimals for their results, we shall follow the plan
that we have used previously in Sections 34 (p. 125) and 44 (p. 164)
in computing $\sigma$, $\alpha_3$, and $\alpha_4$. The labor of computation can be greatly
reduced by expressing r in terms of the original variates X and Y,
or in terms of x' and y' where x' and y' are deviations in class units
from some fixed origin (h, k).

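In Chapter 4 we have seen that:

$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - M_X^2} \quad\text{or}\quad n\sigma_X = \sqrt{n\sum X^2 - (\sum X)^2}$$

Similarly:

$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - M_Y^2} \quad\text{or}\quad n\sigma_Y = \sqrt{n\sum Y^2 - (\sum Y)^2}$$

Also, since

$$x = X - M_X, \qquad y = Y - M_Y$$

we have:

$$xy = XY - M_Y X - M_X Y + M_X M_Y$$

and

$$\sum xy = \sum XY - M_Y\sum X - M_X\sum Y + nM_X M_Y$$

Recalling that

$$\sum X = nM_X \quad\text{and}\quad \sum Y = nM_Y$$

we have:

$$\sum xy = \sum XY - nM_X M_Y$$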


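The formula for r can now be expressed in the useful form:

$$r = \frac{\sum XY - nM_X M_Y}{\sqrt{\sum X^2 - nM_X^2}\;\sqrt{\sum Y^2 - nM_Y^2}} \qquad (7)$$

Formula (7) may also be written

$$r = \frac{\dfrac{\sum XY}{n} - M_X M_Y}{\sqrt{\dfrac{\sum X^2}{n} - M_X^2}\;\sqrt{\dfrac{\sum Y^2}{n} - M_Y^2}} \qquad (7')$$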
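
Formula (7) lends itself to machine computation because it needs only the raw sums. A minimal sketch follows, comparing it with the deviation form (3); the five pairs are hypothetical and the function names are illustrative assumptions.

```python
from math import sqrt

def r_from_deviations(X, Y):
    """Formula (3): r = sum(xy) / (n * sigma_x * sigma_y)."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(X, Y))
    sxx = sum((a - mx) ** 2 for a in X)
    syy = sum((b - my) ** 2 for b in Y)
    return sxy / sqrt(sxx * syy)

def r_from_raw_sums(X, Y):
    """Formula (7): r from the sums of X*Y, X^2, Y^2 and the two means."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sum_xy = sum(a * b for a, b in zip(X, Y))
    sum_x2 = sum(a * a for a in X)
    sum_y2 = sum(b * b for b in Y)
    numerator = sum_xy - n * mx * my
    denominator = sqrt(sum_x2 - n * mx ** 2) * sqrt(sum_y2 - n * my ** 2)
    return numerator / denominator

# Hypothetical illustrative data (not from the text).
X = [12, 15, 17, 20, 26]
Y = [30, 28, 25, 22, 14]

r_dev = r_from_deviations(X, Y)
r_raw = r_from_raw_sums(X, Y)
```

When the variates are large and nearly equal, the raw-sum form subtracts two large, nearly equal quantities, so the deviation form may be the safer choice in floating-point work.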


We shall next derive a formula for r in which the X and Y variates
will be expressed in their respective class widths as units and measured
from some arbitrary origin (h, k).

Let: C be the centroidal point $(M_X, M_Y)$
O' be the arbitrary origin (h, k)
$w_x$ = the class width of the X variates
$w_y$ = the class width of the Y variates

$$M_X = h + w_x b_x \quad\text{where}\quad b_x = \frac{\sum x'}{n} \quad\text{or}\quad nb_x = \sum x'$$

$$\sigma_X = w_x\sqrt{\frac{\sum x'^2}{n} - b_x^2}$$

Similarly, we can find:

$$M_Y = k + w_y b_y \quad\text{where}\quad b_y = \frac{\sum y'}{n} \quad\text{or}\quad nb_y = \sum y'$$

$$\sigma_Y = w_y\sqrt{\frac{\sum y'^2}{n} - b_y^2}$$

From Figure 40 we have the following relations:

a. $X = x + M_X$, $Y = y + M_Y$
b. $X = h + w_x x'$, $Y = k + w_y y'$
c. $x = w_x x' - w_x b_x$, $y = w_y y' - w_y b_y$

Applying the relations c. above, we have:

$$xy = w_x w_y\,(x'y' - b_y x' - b_x y' + b_x b_y)$$

and hence

$$\sum xy = w_x w_y\,(\sum x'y' - b_y\sum x' - b_x\sum y' + nb_x b_y)$$

Substituting $\sum x' = nb_x$ and $\sum y' = nb_y$, we have:

$$\sum xy = w_x w_y\,(\sum x'y' - nb_x b_y)$$



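[Figure 40]

Replacing in equation (3) $\sum xy$ by the value just found and $\sigma_X$ and
$\sigma_Y$ by their values in terms of the primed letters, we have:

$$r = \frac{\dfrac{\sum x'y'}{n} - b_x b_y}{\sqrt{\dfrac{\sum x'^2}{n} - b_x^2}\;\sqrt{\dfrac{\sum y'^2}{n} - b_y^2}} \qquad (8)$$

the class widths canceling in the process.

By simple transformations equation (8) reduces to:

$$r = \frac{n\sum x'y' - \sum x'\sum y'}{\sqrt{n\sum x'^2 - (\sum x')^2}\;\sqrt{n\sum y'^2 - (\sum y')^2}} \qquad (9)$$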


The following example will illustrate the method of procedure for
computing r by either formula, (8) or (9).

X = the grade on the first test
Y = the grade on the second test

Table 56. Grades of Two Tests of 10 Students
in Integral Calculus

Student     X     Y     x'     y'     x'y'     x'²     y'²

   1       85    77     10      7      70      100      49
   2       82    77      7      7      49       49      49
   3       91    82     16     12     192      256     144
   4       80    74      5      4      20       25      16
   5       75    70      0      0       0        0       0
   6       95    87     20     17     340      400     289
   7       83    77      8      7      56       64      49
   8       85    77     10      7      70      100      49
   9       88    82     13     12     156      169     144
  10       77    71      2      1       2        4       1

Total                   91     74     955    1,167     790

Let h = 75, k = 70, $w_x = 1$, and $w_y = 1$.

We have from the table:

$\sum x' = 91$, $\sum y' = 74$, $\sum x'y' = 955$, $\sum x'^2 = 1167$, $\sum y'^2 = 790$, and n = 10

Hence:

$$b_x = 9.1, \quad b_y = 7.4, \quad \frac{\sum x'y'}{n} = 95.5, \quad \frac{\sum x'^2}{n} = 116.7, \quad \frac{\sum y'^2}{n} = 79$$

Therefore by (8):

$$r = \frac{95.5 - (9.1)(7.4)}{\sqrt{116.7 - 82.81}\;\sqrt{79 - 54.76}} = 0.98$$
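
The computation of Table 56 by formula (8) can be sketched in Python. The function name `r_coded` is an illustrative assumption; the grades are those of Table 56. The second call uses a different, arbitrarily chosen origin and class width to illustrate that the choice of (h, k), $w_x$, and $w_y$ does not affect r.

```python
from math import sqrt

# Table 56: grades of ten students on two tests.
X = [85, 82, 91, 80, 75, 95, 83, 85, 88, 77]
Y = [77, 77, 82, 74, 70, 87, 77, 77, 82, 71]

def r_coded(X, Y, h, k, w_x=1.0, w_y=1.0):
    """Formula (8): r from deviations x', y' in class units from the origin (h, k)."""
    n = len(X)
    xp = [(v - h) / w_x for v in X]
    yp = [(v - k) / w_y for v in Y]
    b_x = sum(xp) / n
    b_y = sum(yp) / n
    numerator = sum(a * b for a, b in zip(xp, yp)) / n - b_x * b_y
    denominator = (sqrt(sum(a * a for a in xp) / n - b_x ** 2)
                   * sqrt(sum(b * b for b in yp) / n - b_y ** 2))
    return numerator / denominator

# Origin and class widths used in the text: h = 75, k = 70, w_x = w_y = 1.
r_text = r_coded(X, Y, 75, 70)

# A different origin and class width (arbitrary choices) give the same r.
r_other = r_coded(X, Y, 80, 75, 5, 5)

print(round(r_text, 2))
```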


66. SUMMARY AND EXTENSION OF THE THEORY
OF CORRELATION

In Chapters 7 and 8 we have assumed that our data could be
represented by the straight-line equation, $Y = m_1X + b_1$, in which
X is the independent variable and Y the dependent variable. By
minimizing the sum of the squares of the Y-residuals, we derived the
normal equations:

$$m_1\sum X + nb_1 = \sum Y$$
$$m_1\sum X^2 + b_1\sum X = \sum XY$$




Solving these normal equations for $m_1$ and $b_1$, we obtained

$$m_1 = \frac{n\sum XY - \sum X\sum Y}{n\sum X^2 - (\sum X)^2}, \qquad b_1 = \frac{\sum X^2\sum Y - \sum X\sum XY}{n\sum X^2 - (\sum X)^2} \qquad \text{(5) of Section 59}$$

and hence the equation of the line of regression of Y on X may be
found. This equation, for assigned values of X, gives the most
probable values for Y. This line (see Exercise 4 on p. 221) passes
through the point $(M_X, M_Y)$ and (see Exercise 5 on p. 221) also
possesses the property that the sum of the Y-residuals from it is zero.

If the square root of the mean of the squares of the Y-residuals be
taken as a measure of the closeness of the concentration of the points
about the line, we find:

$$S_y = \sigma_Y\sqrt{1 - r^2} \qquad \text{(2) of Section 63}$$

where

$$r = \frac{\sum xy}{n\sigma_X\sigma_Y} \qquad \text{(3) of Section 63}$$

Then:

$$m_1 = r\,\frac{\sigma_Y}{\sigma_X} \qquad \text{(4) of Section 63}$$

and the equation of the line of regression of Y on X becomes:

$$Y - M_Y = r\,\frac{\sigma_Y}{\sigma_X}\,(X - M_X) \qquad \text{(6) of Section 63}$$


In like manner we may arrive at similar results by basing our
procedure upon the equation $X = m_2Y + b_2$, where Y is now the
independent variable and X is the dependent variable. 1 If we
minimize the sum of the squares of the X-residuals we arrive at the
normal equations:

$$m_2\sum Y + nb_2 = \sum X$$
$$m_2\sum Y^2 + b_2\sum Y = \sum XY$$

1 Note that $m_2$ is not the slope of this line.

If these equations be solved for $m_2$ and $b_2$, we obtain:

$$m_2 = \frac{n\sum XY - \sum X\sum Y}{n\sum Y^2 - (\sum Y)^2}, \qquad b_2 = \frac{\sum Y^2\sum X - \sum Y\sum XY}{n\sum Y^2 - (\sum Y)^2} \qquad (10)$$

and hence the equation of the line of regression of X on Y may be
found. This equation, for assigned values of Y, gives the most
probable values of X. This line also passes through the point
$(M_X, M_Y)$ and possesses the property that the sum of the X-residuals
from it is zero.

If the square root of the mean of the squares of the X-residuals be
taken as a measure of the closeness of the concentration of the
points about this line of regression of X on Y, we find:

$$S_x = \sigma_X\sqrt{1 - r^2} \qquad (11)$$

where, as before,

$$r = \frac{\sum xy}{n\sigma_X\sigma_Y}$$

The value of $m_2$ may now be written

$$m_2 = r\,\frac{\sigma_X}{\sigma_Y} \qquad (12)$$

and the equation of the line of regression of X on Y may be written:

$$x = r\,\frac{\sigma_X}{\sigma_Y}\,y$$

or

$$X - M_X = r\,\frac{\sigma_X}{\sigma_Y}\,(Y - M_Y) \qquad (13)$$

We can therefore obtain two straight lines which fit the given n
points according to the principle of least squares. We can minimize
the sum of the squares of the Y-residuals of the line $Y = m_1X + b_1$
and obtain the regression line of Y on X given by equation (6).
This line is to be used to find the most probable Y for a given X.
We can minimize the sum of the squares of the X-residuals of the
line $X = m_2Y + b_2$ and obtain the regression line of X on Y given
by equation (13). It is to be used to find the most probable X for
a given Y.





Question: For what values of r will the lines (6) and (13) coincide?

The quantities $m_1$ and $m_2$ are called coefficients of regression. It
may be noted that:

$$r^2 = m_1 m_2 \qquad (14)$$
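
The two regression lines and relation (14) can be sketched in Python from the raw-sum solutions (5) and (10). The function name is an illustrative assumption; the data are the grades of Table 56.

```python
from math import sqrt

# Table 56: grades of ten students on two tests.
X = [85, 82, 91, 80, 75, 95, 83, 85, 88, 77]
Y = [77, 77, 82, 74, 70, 87, 77, 77, 82, 71]

def regression_coefficients(X, Y):
    """Solve both pairs of normal equations; return m1, b1, m2, b2."""
    n = len(X)
    sx, sy = sum(X), sum(Y)
    sxy = sum(a * b for a, b in zip(X, Y))
    sxx = sum(a * a for a in X)
    syy = sum(b * b for b in Y)
    # Regression of Y on X: Y = m1*X + b1, equations (5) of Section 59.
    m1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b1 = (sxx * sy - sx * sxy) / (n * sxx - sx ** 2)
    # Regression of X on Y: X = m2*Y + b2, equations (10).
    m2 = (n * sxy - sx * sy) / (n * syy - sy ** 2)
    b2 = (syy * sx - sy * sxy) / (n * syy - sy ** 2)
    return m1, b1, m2, b2

m1, b1, m2, b2 = regression_coefficients(X, Y)

# r from formula (9) with the original variates (origin 0, class width 1).
n = len(X)
r = ((n * sum(a * b for a, b in zip(X, Y)) - sum(X) * sum(Y))
     / (sqrt(n * sum(a * a for a in X) - sum(X) ** 2)
        * sqrt(n * sum(b * b for b in Y) - sum(Y) ** 2)))

M_x, M_y = sum(X) / n, sum(Y) / n
```

With these values `m1 * m2` agrees with `r ** 2`, and both lines pass through the point $(M_X, M_Y)$.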

If the deviations of the X and Y variates from their respective
means be expressed in units of their standard deviations, that is, if

$$t = \frac{x}{\sigma_X} = \frac{X - M_X}{\sigma_X} \quad\text{and}\quad s = \frac{y}{\sigma_Y} = \frac{Y - M_Y}{\sigma_Y},$$

the equation (6) becomes:

$$s = rt \qquad (15)$$

That is, r is the slope of the line of regression of Y on X when the
variates x and y are expressed in standard units.
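
A brief check of equation (15), as a sketch: the pairs are hypothetical, `standard_units` is an illustrative helper, and `statistics.pstdev` is used because it divides by n, as the text's $\sigma$ does.

```python
from statistics import fmean, pstdev

def standard_units(values):
    """Express each value as a deviation from the mean in units of sigma."""
    m, s = fmean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical illustrative data (not from the text).
X = [2, 4, 6, 8, 10]
Y = [3, 7, 5, 11, 9]
n = len(X)

t = standard_units(X)   # t = (X - M_X) / sigma_X
s = standard_units(Y)   # s = (Y - M_Y) / sigma_Y

# In standard units r is simply the mean product of t and s ...
r = sum(a * b for a, b in zip(t, s)) / n

# ... and the least-squares slope of s on t is sum(t*s) / sum(t^2), with sum(t^2) = n.
slope = sum(a * b for a, b in zip(t, s)) / sum(a * a for a in t)
```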


EXERCISES 

1. Using the data in Table 12 (p. 47), find the correlation between the
quantity of beef available for consumption and the price per hundredweight.
Let X equal the quantity available and Y equal the price.

2. In the following table the cows considered were of the same breed
under the same management. Find r.

Value of Food Consumed by 26 Cows and Value of Products per Cow 1

Value of Feed   Value of Product     Value of Feed   Value of Product
Consumed        per Cow              Consumed        per Cow
X               Y                    X               Y

$99.83          $246.10              $98.93          $174.64
 86.42           207.76               82.69           143.61
 91.05           216.52               82.94           143.18
 94.05           220.01               87.03           150.02
 94.06           214.87               89.07           153.51
 86.06           183.53               83.52           [illegible]
 84.20           176.39               83.10           [illegible]
 86.70           178.56               89.16           [illegible]
 86.75           178.11               83.01           [illegible]
 86.57           166.70               89.32           [illegible]
 88.52           169.20               82.22           [illegible]
 94.01           179.25               99.74           [illegible]
 86.23           157.20               84.77           [illegible]

1 The data are from Horace Secrist, Readings and Problems in Statistical
Methods, 1920, p. 420.




3. In the following table:

X = production in millions of bales

Y = price per pound in cents received by producers December 1

Production and Price of Cotton in the United States, 1 1907-1929

Year     X      Y        Year     X      Y        Year     X      Y

1907    11.1   10.4      1920    13.4   13.9      1917    11.3   27.7
1908    13.2    8.7      1921     8.0   16.2      1918    12.0   27.6
1909    10.0   13.9      1922     9.8   23.8      1919    11.4   35.6
1910    11.6   14.1      1923    10.1   31.0
1911    15.7    8.8      1924    13.6   22.6
1912    13.7   11.9      1925    16.1   18.2
1913    14.2   12.2      1926    18.0   10.9
1914    16.1    6.8      1927    13.0   19.6
1915    11.2   11.3      1928    14.3   18.0
1916    11.5   19.6      1929    14.5   16.4


a. Find r for the ten-year period, 1907 to 1916 inclusive. 

b. Find r for the ten-year period, 1920 to 1929 inclusive. The years 
1917, 1918, and 1919 were abnormal years, and may be omitted from the 
computation. 


4. Using the relation

$$(x - y)^2 = x^2 - 2xy + y^2$$

show that:

$$r = \frac{\sigma_X^2 + \sigma_Y^2 - \sigma_{X-Y}^2}{2\sigma_X\sigma_Y}$$

In order to compute the value of r by this formula, what are the
implied restrictions upon the X and Y units?


5. Verify the value of r for the data of Table 56 by using the formula 
of Exercise 4. 

6 . Find the value of r for the “Savings Bank, Strikes and Lockouts” 
data of Exercise 3, page 236. Is the formula of Exercise 4 applicable to 
these data? 

7. Show that

$$r_{aX+b,\;cY+d} = r_{XY}$$

8. Given [formula illegible in source; the surviving symbols are $r_{XY}$, $\sigma_\rho$, and $\sigma_Y$]



1 The data are from Yearbook of Agriculture, 1928, p. 837; Commerce Yearbook,
1930, p. 216.






9. For a given X we estimate Y (call it $Y_{est.}$) by the equation (see
Section 63)

$$Y_{est.} = r\,\frac{\sigma_Y}{\sigma_X}\,(X - M_X) + M_Y$$

(a) Show that the arithmetic mean of the estimated values of Y is equal
to the arithmetic mean of the observed values of Y, or that

$$M_{Y\,est.} = M_Y$$

(b) Show that

$$\sigma_{Y\,est.} = r\,\sigma_Y$$

and thus that

$$r = \pm\frac{\sigma_{Y\,est.}}{\sigma_Y}$$

the sign to be that of m.



10. [Table not recoverable from the source.]

(1) Plot the data.

(2) Complete the table.

(3) Compute $\sigma_X$.

(4) Compute $\sigma_Y$.

(5) Compute r.

(6) Compute m.

(7) Find the regression line.

11. Treat the data in the accompanying tables as you did those in
Number 10 above. [Table not recoverable from the source.]

12. Treat the data in the accompanying tables as you did those in
Number 10 above. [Table not recoverable from the source.]





13. The equations of the lines of regression for a set of data are y = 0.72x
and x = 0.64y. What is the value of r for the data?

14. The equations of the lines of regression for a set of data are
y = -0.8x and x = -0.45y. What is the value of r?

67. COMPUTATION OF r FOR GROUPED DATA 

For sufficiently small values of n, say n < 30, one of the methods
employed in the preceding sections is usually used in computing the
coefficient of correlation.

If n is a very large number, we are compelled to construct a double-entry
table. To construct such a table (see Table 57), the sheet is
ruled horizontally and vertically, thus dividing the sheet into a
system of columns and a system of rows, each of which is a frequency
distribution. Each of the rectangles in a row or column is called a
cell. Along the left-hand margin from bottom to top are laid off the
class intervals or the class marks of the Y variates, and along the
top of the diagram from left to right are laid off the class intervals
or the class marks of the X variates. Very much as we plot points
on an ordinary X-, Y-coordinate system, each observed individual
may now be located on this sheet, the preliminary or tally sheet,
with respect to the X and Y measures. We shall locate each individual
on the preliminary sheet with a + sign placed within the
appropriate cell. Since we shall finally concentrate all the measures
in a given cell at its center, it is not necessary that the points be
plotted with more precision than is necessary to locate them in the
appropriate cells. When all the individuals are accurately located
we have a scatter diagram.


Table 57 






A correlation table may now be obtained from the preliminary sheet 
by writing within each cell the number of + marks which fall within 
it. This number is called the cell frequency. We shall indicate a cell 
frequency by f(x, y). The table is now used for a worksheet or a 
computation sheet . 

The numbers in a column corresponding to an assigned X, say
$X = X_1$, form a Y-array of the type $X_1$, and those in a row
corresponding to an assigned Y, say $Y = Y_1$, form an X-array of type $Y_1$.

The correlation table may be represented geometrically by a sur- 
face. At the center of each cell imagine a vertical erected with a 
height proportional to the cell frequency. If the tops of these verticals 
be joined, an irregular surface results. If the cells are made smaller 
and smaller while the frequencies remain finite, the irregular surface 
will approach a regular surface which is called a frequency surface 
or a correlation surface. 

Since in this chapter we are dealing with grouped data, it is ad- 
visable that we write our formulas for r in the frequency forms. 
Thus equations (3), (7'), (8), and (9) become: 


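$$r = \frac{\sum xy\,f(x, y)}{n\,\sigma_X \sigma_Y} \tag{16}$$

$$r = \frac{\sum XY\,f(x, y) - n M_X M_Y}{n\,\sigma_X \sigma_Y} \tag{17}$$

$$r = \frac{n\sum XY\,f(x, y) - \sum X f(x) \sum Y f(y)}{\sqrt{n\sum X^2 f(x) - \left[\sum X f(x)\right]^2}\;\sqrt{n\sum Y^2 f(y) - \left[\sum Y f(y)\right]^2}} \tag{18}$$

$$r = \frac{n\sum x'y'\,f(x, y) - \sum x' f(x) \sum y' f(y)}{\sqrt{n\sum x'^2 f(x) - \left[\sum x' f(x)\right]^2}\;\sqrt{n\sum y'^2 f(y) - \left[\sum y' f(y)\right]^2}} \tag{19}$$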
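The arbitrary-origin form (19) translates directly into a short routine. The following Python sketch is an illustration only: the function name `grouped_r` and the small 3-by-3 table `toy` are hypothetical and are not taken from the text. Each key is a cell's (x', y') in class-interval units and each value is the cell frequency f(x, y).

```python
from math import sqrt

def grouped_r(cells):
    """Correlation coefficient from cell frequencies, following form (19).

    `cells` maps (x', y') -> f(x, y), with x' and y' measured from an
    arbitrary origin in class-interval units.
    """
    n = sum(cells.values())
    sx = sum(x * f for (x, _), f in cells.items())        # sum of x' f(x)
    sy = sum(y * f for (_, y), f in cells.items())        # sum of y' f(y)
    sxx = sum(x * x * f for (x, _), f in cells.items())   # sum of x'^2 f(x)
    syy = sum(y * y * f for (_, y), f in cells.items())   # sum of y'^2 f(y)
    sxy = sum(x * y * f for (x, y), f in cells.items())   # sum of x'y' f(x, y)
    return (n * sxy - sx * sy) / (
        sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2)
    )

# Hypothetical correlation table (not from the text).
toy = {(-1, -1): 3, (0, -1): 1, (0, 0): 4, (1, 0): 1, (1, 1): 3}
r = grouped_r(toy)
print(round(r, 4))
```

Because r is unchanged by a shift of origin or a change of unit, the same value results whether the class marks or the primed units are supplied.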


The data of Table 58 will be used as an example to illustrate the 
construction of the preliminary sheet, the correlation table, and the 
method employed in computing r, the regression equations, and 
the standard error of estimate. 

We shall let the percentage of native white population be measured 
along the horizontal or X-axis, and the percentage of illiteracy be 





Table 58. Percentage of Native White and Percentage of
Illiterate Ten Years of Age and Over in the Population
of Pennsylvania, by Counties, 1920 ¹

X = Percentage Native White; Y = Percentage Illiterate

County            X      Y      County             X      Y
Adams            98.7   1.6     Lackawanna        77.1   8.6
Allegheny        74.5   4.8     Lancaster         96.3   1.4
Armstrong        86.4   4.3     Lawrence          80.2   6.5
Beaver           75.9   6.2     Lebanon           95.3   2.5
Bedford          96.9   3.2     Lehigh            89.3   2.3
Berks            93.4   2.7     Luzerne           77.3   9.5
Blair            92.2   2.6     Lycoming          94.4   1.5
Bradford         96.2   2.0     McKean            86.2   1.8
Bucks            87.6   2.5     Mercer            80.0   6.2
Butler           89.6   3.0     Mifflin           96.4   2.4
Cambria          79.2   6.3     Monroe            95.0   2.3
Cameron          89.9   2.6     Montgomery        83.4   3.6
Carbon           82.3   8.3     Montour           94.2   4.8
Center           93.5   1.8     Northampton       81.9   5.2
Chester          81.7   4.5     Northumberland    88.9   4.7
Clarion          96.4   1.8     Perry             98.9   1.5
Clearfield       85.9   4.4     Philadelphia      70.7   4.0
Clinton          93.0   2.5     Pike              90.6   1.3
Columbia         93.0   2.8     Potter            93.0   1.9
Crawford         92.3   1.5     Schuylkill        84.0   7.9
Cumberland       96.2   1.4     Snyder            99.8   2.1
Dauphin          87.8   3.3     Somerset          84.3   6.4
Delaware         75.8   4.4     Sullivan          90.4   4.5
Elk              81.6   3.1     Susquehanna       91.0   2.8
Erie             84.8   4.0     Tioga             93.4   2.4
Fayette          76.3   8.2     Union             99.3   1.4
Forest           94.1   2.7     Venango           92.7   3.5
Franklin         97.0   1.8     Warren            86.2   3.3
Fulton           98.9   2.3     Washington        74.0   7.3
Greene           93.5   4.4     Wayne             90.9   2.8
Huntingdon       93.1   4.0     Westmoreland      77.8   7.6
Indiana          82.4   5.9     Wyoming           96.0   1.6
Jefferson        88.0   3.5     York              97.2   1.6
Juniata          99.1   1.2





measured along the vertical or Y-axis. The class widths, $w_x$ and $w_y$,
may be selected in accordance with the principles suggested in

1 The data are from Fourteenth Census of the United States, Vol. III, pp. 859-65.



[Preliminary Sheet: tally of the counties, with Percentage Native White
on the horizontal axis and Percentage Illiterate on the vertical axis.]



Section 13 (p. 30). Since the X variates range from 70.7 to 99.8,
we shall choose $w_x = 3$, and since the Y variates range from 1.2 to
9.5, we shall choose $w_y = 1$. Also, since the given measures are
accurate to tenths, we shall express our class boundaries to
hundredths.¹ Plotting the points, we have the preliminary sheet.

The preliminary sheet is now complete. We are now ready to 
transcribe the results of the tally to the computation sheet. We 
then have Table 59. 

Having formed the correlation table, which is the part of the 
table bounded by the double lines, we arrange the computation to 

1 If the student prefers he may use some other method for fixing the class
limits. Any method recommended in Section 12 (p. 23) will be satisfactory.
Thus the X-class intervals may be 70.0-72.9, 73.0-75.9, etc., and the Y-class
intervals may be 1.0-1.9, 2.0-2.9, etc. The class marks will be changed accordingly.
The X-class marks will become 71.45, 74.45, etc., and the Y-class marks will
become 1.45, 2.45, etc.




simplify as far as possible the somewhat complicated details. We
first add the frequencies of the rows and columns and obtain the
row marked f(x) and the column marked f(y). Choosing an arbitrary
origin (h, k) near the center of the table, in Table 59 (h, k)
= (83.45, 4.45), and the class intervals as units of measurement,
we obtain the row marked x' and the column marked y'. That is, we
use the familiar transformations $X = h + w_x x'$ and $Y = k + w_y y'$.
The next two rows, $x'f(x)$ and $x'^2 f(x)$, and the next two columns,
$y'f(y)$ and $y'^2 f(y)$, are self-explanatory and are used in computing
the means, $M_X$ and $M_Y$, and the standard deviations, $\sigma_X$ and $\sigma_Y$.

The column headed x'y'f(x, y) needs some explanation. Recalling
formula (18) for computing r, we note that we must find $\sum x'y'f(x, y)$.
That is, we must find the x'y' for each individual measured, then find
their sum. Since the frequency of any cell is concentrated at the
center of the cell, we shall compute the x'y' for the frequency of each
cell by multiplying the x'y' of each cell by the cell frequency, and
adding the x'y' for all the cells of a given row. In this manner we
obtain the numbers in the column headed x'y'f(x, y). By adding the
x'y' of all the rows, we obtain the sum of the x'y' of the entire table.¹
Thus:

for row Y = 8.45, the total x'y' is (-2)(4)(2) + (0)(4)(1) = -16

The total x'y' for each of the other rows is found in a similar manner.

Consequently, for the entire distribution we have:

n = 67, h = 83.45, k = 4.45, $w_x = 3$, $w_y = 1$

$\sum x'f(x) = 116$, $\sum y'f(y) = -52$, $\sum x'^2 f(x) = 612$,
$\sum y'^2 f(y) = 342$, $\sum x'y'f(x, y) = -362$

Therefore:

$$M_X = 83.45 + 3\left(\frac{116}{67}\right) = 88.64\%$$

$$M_Y = 4.45 + 1\left(\frac{-52}{67}\right) = 3.67\%$$

$$\sigma_{x'} = \sqrt{\frac{612}{67} - \left(\frac{116}{67}\right)^2} = 2.48$$

$$\sigma_X = 3(2.48) = 7.44\%$$
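The tally and the column sums can be reproduced mechanically. The sketch below is an illustration, not the book's worksheet: it assumes the class intervals 70.0-72.9, 73.0-75.9, etc. for X and 1.0-1.9, 2.0-2.9, etc. for Y (which place every county in the same cell as boundaries written to hundredths), and the helper names `x_unit` and `y_unit` are hypothetical. The data are the 67 county pairs of Table 58.

```python
from collections import Counter
from math import sqrt

# (X, Y) = (percentage native white, percentage illiterate), Table 58.
DATA = [
    (98.7, 1.6), (74.5, 4.8), (86.4, 4.3), (75.9, 6.2), (96.9, 3.2),
    (93.4, 2.7), (92.2, 2.6), (96.2, 2.0), (87.6, 2.5), (89.6, 3.0),
    (79.2, 6.3), (89.9, 2.6), (82.3, 8.3), (93.5, 1.8), (81.7, 4.5),
    (96.4, 1.8), (85.9, 4.4), (93.0, 2.5), (93.0, 2.8), (92.3, 1.5),
    (96.2, 1.4), (87.8, 3.3), (75.8, 4.4), (81.6, 3.1), (84.8, 4.0),
    (76.3, 8.2), (94.1, 2.7), (97.0, 1.8), (98.9, 2.3), (93.5, 4.4),
    (93.1, 4.0), (82.4, 5.9), (88.0, 3.5), (99.1, 1.2),
    (77.1, 8.6), (96.3, 1.4), (80.2, 6.5), (95.3, 2.5), (89.3, 2.3),
    (77.3, 9.5), (94.4, 1.5), (86.2, 1.8), (80.0, 6.2), (96.4, 2.4),
    (95.0, 2.3), (83.4, 3.6), (94.2, 4.8), (81.9, 5.2), (88.9, 4.7),
    (98.9, 1.5), (70.7, 4.0), (90.6, 1.3), (93.0, 1.9), (84.0, 7.9),
    (99.8, 2.1), (84.3, 6.4), (90.4, 4.5), (91.0, 2.8), (93.4, 2.4),
    (99.3, 1.4), (92.7, 3.5), (86.2, 3.3), (74.0, 7.3), (90.9, 2.8),
    (77.8, 7.6), (96.0, 1.6), (97.2, 1.6),
]
H, K, WX, WY = 83.45, 4.45, 3, 1   # arbitrary origin and class widths

def x_unit(x):
    """x' of the class containing x (classes 70.0-72.9, 73.0-75.9, ...)."""
    return (round(x * 10) - 700) // 30 - 4

def y_unit(y):
    """y' of the class containing y (classes 1.0-1.9, 2.0-2.9, ...)."""
    return (round(y * 10) - 10) // 10 - 3

# Cell frequencies f(x, y), keyed by (x', y').
cells = Counter((x_unit(x), y_unit(y)) for x, y in DATA)

n = sum(cells.values())
sum_x = sum(xu * f for (xu, yu), f in cells.items())
sum_y = sum(yu * f for (xu, yu), f in cells.items())
sum_xx = sum(xu * xu * f for (xu, yu), f in cells.items())
sum_yy = sum(yu * yu * f for (xu, yu), f in cells.items())
sum_xy = sum(xu * yu * f for (xu, yu), f in cells.items())

mean_x = H + WX * sum_x / n
mean_y = K + WY * sum_y / n
sd_x_units = sqrt(sum_xx / n - (sum_x / n) ** 2)
sd_y_units = sqrt(sum_yy / n - (sum_y / n) ** 2)

print(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy)
print(round(mean_x, 2), round(mean_y, 2))
```

Working in integer tenths (`round(x * 10)`) avoids floating-point trouble at the class limits.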

1 A row x'y'f(x, y) is similarly found. It is useful for checking the column
x'y'f(x, y).



Table 59. Computation Sheet for Finding r for the Data of Table 58





$$\sigma_{y'} = \sqrt{\frac{342}{67} - \left(\frac{-52}{67}\right)^2} = 2.12$$

$$\sigma_Y = 1(2.12) = 2.12\%$$


Using equation (18) we have: 


$$r = \frac{\dfrac{-362}{67} - \left(\dfrac{116}{67}\right)\left(\dfrac{-52}{67}\right)}{(2.48)(2.12)} = -0.77$$


The equations of the lines of regression can now be found: 


Using

$$m_1 = r\,\frac{\sigma_Y}{\sigma_X} = \frac{-0.77(2.12)}{3(2.48)} = -0.22$$

and

$$Y - M_Y = m_1(X - M_X)$$

we obtain the equation for the regression of Y on X with its $S_y$. It is:

$$Y - 3.67 = -0.22(X - 88.64) \quad \text{or} \quad Y = -0.22X + 23.17$$

$$S_y = 2.12\sqrt{1 - (0.77)^2} = 1.35\%$$


For a given value of X, this equation gives the best (that is, the
most probable) value for Y. This most probable value of Y is the
mean of the Y-array corresponding to a given value of X. Hence,
the equation above gives the expected mean¹ of the Y-array for
a given X. We use $S_y$ to measure its reliability.

For example, if X = 86.45, we obtain Y = 4.15 for the estimated
or expected mean of the Y-array. We may compare this with the
observed mean for X = 86.45 by computing the mean of the
distribution in the usual manner. We find the observed mean for the
Y-array corresponding to X = 86.45 to be 3.28.
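The chain from the five sums to r, the slope, and the standard error of estimate can be checked numerically. This is a minimal sketch using the sums quoted in the text; the variable and function names are hypothetical. Because no intermediate value is rounded, the results agree with the hand computation only to about two decimals (the estimate at X = 86.45 comes out near 4.16 rather than 4.15).

```python
from math import sqrt

# Sums for the grouped Pennsylvania data, as quoted in the text.
n, h, k, wx, wy = 67, 83.45, 4.45, 3, 1
sx, sy, sxx, syy, sxy = 116, -52, 612, 342, -362

mean_x = h + wx * sx / n
mean_y = k + wy * sy / n
sd_x = wx * sqrt(sxx / n - (sx / n) ** 2)
sd_y = wy * sqrt(syy / n - (sy / n) ** 2)

# Correlation coefficient by the arbitrary-origin form.
r = (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

# Regression of Y on X and its standard error of estimate.
m1 = r * sd_y / sd_x

def y_est(x):
    """Estimated mean of the Y-array for a given X."""
    return mean_y + m1 * (x - mean_x)

s_y = sd_y * sqrt(1 - r ** 2)

print(round(r, 2), round(m1, 2), round(s_y, 2), round(y_est(86.45), 2))
```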

When X = 86.45 we found the estimated Y, $Y_{est.}$, to be 4.15.
Combining this value with its measure of reliability $S_y = 1.35$ we
have this fact: the odds are 2 to 1 that the observed Y for X = 86.45
does not differ numerically from $Y_{est.} = 4.15$ by more than 1.35.

1 It may be shown that the line of regression of Y on X is the line which best
fits the points which designate the means of the Y-arrays or columns, and that
the line of regression of X on Y best fits the points which designate the means of
the X-arrays or rows.





In other words, the odds are 2 to 1 that for X = 86.45% the
observed Y will lie in the interval 4.15 ± 1.35%.

Using

$$m_2 = r\,\frac{\sigma_X}{\sigma_Y} = \frac{-0.77(7.44)}{2.12} = -2.70$$

and

$$X - M_X = m_2(Y - M_Y)$$

we obtain the equation for the regression of X on Y with its $S_x$. It is:

$$X - 88.64 = -2.70(Y - 3.67) \quad \text{or} \quad X = -2.70Y + 98.55$$

$$S_x = 7.44\sqrt{1 - (0.77)^2} = 4.75\%$$

As a check, $m_1 m_2 = (-0.22)(-2.70) = 0.59$, which agrees with $r^2 = (0.77)^2$.

For a given value of Y, this equation gives the most probable
value for X. That is, for a given Y, this equation gives the expected
mean of the corresponding X-array.

For example, if Y = 3.45, we obtain X = 89.23 for the estimated
mean of the array. The observed mean of the X-array corresponding
to Y = 3.45 is X = 87.95. We use $S_x$ to measure the reliability
of the estimate. Thus the odds are 2 to 1 that for Y = 3.45% the
observed X will lie within the interval 89.23 ± 4.75%.
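The slope of the regression of X on Y can be verified from the same sums, together with the identity that the product of the two regression slopes equals $r^2$. The sketch below is an illustration with hypothetical names; unrounded arithmetic gives $S_x$ near 4.72 and an estimate near 89.25, slightly different from values obtained with rounded intermediates.

```python
from math import sqrt

# Sums for the grouped Pennsylvania data, as quoted in the text.
n, h, k, wx, wy = 67, 83.45, 4.45, 3, 1
sx, sy, sxx, syy, sxy = 116, -52, 612, 342, -362

mean_x = h + wx * sx / n
mean_y = k + wy * sy / n

# Moments in class-interval units.
cov = sxy / n - (sx / n) * (sy / n)
var_x = sxx / n - (sx / n) ** 2
var_y = syy / n - (sy / n) ** 2

r = cov / sqrt(var_x * var_y)
m1 = (wy / wx) * cov / var_x      # slope of Y on X, in original units
m2 = (wx / wy) * cov / var_y      # slope of X on Y, in original units

def x_est(y):
    """Estimated mean of the X-array for a given Y."""
    return mean_x + m2 * (y - mean_y)

s_x = wx * sqrt(var_x) * sqrt(1 - r ** 2)

print(round(m2, 2), round(m1 * m2, 4), round(r * r, 4))
print(round(x_est(3.45), 2), round(s_x, 2))
```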

This completes the theory of simple linear correlation. A word 
about the reliability of r may be in order. If n is fairly large and 
if the surface described on page 254 is closely normal, the reliability 
of r may be tested by either of the formulas: 

$$\sigma_r = \frac{1 - r^2}{\sqrt{n}}$$

$$E_r = 0.6745\,\sigma_r = 0.6745\,\frac{1 - r^2}{\sqrt{n}}$$

with the interpretation of $\sigma_r$ and $E_r$ similar to that employed in
Section 37. Since the assumptions underlying these formulas are
rather severe, they are to be used with care.
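As an illustration of these two formulas, the following Python sketch evaluates them for r = -0.77 and n = 67, the values obtained for the Pennsylvania data. The function names are hypothetical, and the caution about the underlying assumptions applies to the output just as it does to the formulas.

```python
from math import sqrt

def sigma_r(r, n):
    """Standard error of r: (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / sqrt(n)

def probable_error_r(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * sigma_r(r, n)

s = sigma_r(-0.77, 67)
e = probable_error_r(-0.77, 67)
print(round(s, 3), round(e, 3))
```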


EXERCISES 

1. The data for the table on page 261 are taken from the Yearbooks of 
Agriculture: 1920, pp. 753 and 537; 1935, pp. 568 and 379. 

X = price of corn per bushel (cents) 

Y = value of hogs per head (dollars) 




Construct a correlation table with the X-classes: 20 a.u. 35, 35 a.u. 50,
etc., and the Y-classes 3.00 a.u. 6.00, 6.00 a.u. 9.00, etc.

Find r, the equation of the regression line of Y on X, and $S_y$.

Estimate Y when X = 75 and give the odds that measure the reliability
of the estimate.


X = Corn, cents per bu.; Y = Hogs, dollars per head

Year     X      Y        Year     X      Y
1870     49    5.80      1905     41    5.99
1871     43    5.61      1906     40    6.18
1872     35    4.01      1907     52    7.62
1873     44    3.67      1908     61    6.05
1874     58    3.98      1909     58    6.55
1875     37    4.80      1910     48    9.17
1876     34    6.00      1911     62    9.37
1877     35    5.66      1912     49    8.00
1878     32    4.85      1913     69    9.86
1879     38    3.18      1914     64   10.40
1880     40    4.28      1915     58    9.87
1881     64    4.70      1916     89    8.40
1882     49    5.97      1917    128   11.75
1883     42    6.75      1918    137   19.54
1884     36    5.57      1919    135   22.02
1885     33    5.02      1920     68   19.08
1886     37    4.26      1921     53   12.99
1887     44    4.48      1922     75   10.06
1888     34    4.98      1923     84   11.58
1889     28    5.79      1924    105    9.72
1890     51    4.72      1925     70   12.38
1891     41    4.15      1926     75   15.21
1892     39    4.60      1927     85   15.97
1893     37    6.41      1928     84   12.03
1894     46    5.98      1929     80   12.24
1895     25    4.97      1930     59   12.73
1896     22    4.35      1931     32   10.75
1897     26    4.10      1932     32    5.80
1898     29    4.39      1933     52    3.99
1899     30    4.40      1934     85    3.92
1900     36    5.00
1901     61    6.20
1902     40    7.03
1903     43    7.78
1904     44    6.15







2. The accompanying table shows the scores on placement examinations 
of 326 freshmen at Bucknell University. Find r and the equations of the 
lines of regression. 


Examination Scores in Mathematics and English 


Mathematics 



3. In the following table 

X = the number of minutes required to solve a group of arithmetical 
exercises by each of forty employees 
Y = the executive ratings, in per cent, of the same employees 


























Construct a double entry table with the X-classes designated as 8.0 a.u. 
12.0, 12.0 a.u. 16.0, etc., the Y-classes designated as 60 a.u. 65, 65 a.u. 70, etc. 

Find r, the regression line of Y on X, and $S_y$.

What is the estimated value of Y for X = 20, and what is the reliability
of the estimate?

68. CORRELATION BY RANKS 

When two series of values are expressed according to their ranks
and not in terms of their actual values or scores, we can easily find
the approximate correlation between them. Such correlation is used
to find the relation between the paired scores when their number 
is small or when the data do not warrant an application of the cross 
product method to the actual values. Also, the method is useful 
in finding the correlation between series that may be arranged ac- 
cording to size and yet may not be subjected to exact measurement. 

In such correlation as we are here describing we must keep in
mind that the (X, Y) values are the rank or position numbers of some
characteristics. We shall arrange the values in ascending order.
To the smallest value we assign 1, to the next in order 2, etc. We
may then find the rank correlation by employing any of our formulas
for $r_{XY}$ with the data arranged according to ranks. However, a
formula may be easily derived for this special case by a method which
we shall indicate at the end of this section. When ranks are used
we indicate the coefficient by $\rho_{XY}$ or by $r_{XY}$ (rank).

To illustrate the problem we are presenting, let us consider the 
heights and weights of the five boys A, B, C, D, E. 


Table 60. Heights and Weights of Five Boys

Boy   Height     Weight     Rank in Height   Rank in Weight
      (inches)   (pounds)         X                Y
A       60         137            1                2
B       62         132            2                1
C       63         148            3                3
D       65         157            4                5
E       68         153            5                4


For the height-weight data given in columns 2 and 3,
$r_{\text{height, weight}} = 0.77$.

Let us find the cross product coefficient for the rank data given in 
columns 4 and 5 of Table 60. 





          X     Y     x     y    x²    y²    xy    X²    Y²    XY
          1     2    -2    -1     4     1     2     1     4     2
          2     1    -1    -2     1     4     2     4     1     2
          3     3     0     0     0     0     0     9     9     9
          4     5     1     2     1     4     2    16    25    20
          5     4     2     1     4     1     2    25    16    20
Total    15    15     0     0    10    10     8    55    55    53


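$$M_X = M_Y = \frac{15}{5} = 3, \qquad \sigma_X = \sigma_Y = \sqrt{\frac{10}{5}} = \sqrt{2}$$

$$\rho_{XY} = r_{XY}(\text{rank}) = \frac{\sum xy}{n\,\sigma_X \sigma_Y} = \frac{8}{5\sqrt{2}\sqrt{2}} = \frac{8}{10} = 0.80$$

We may also find $\rho_{XY} = r_{XY}$ (rank) by using formula (7) page 245,

$$r = \frac{\sum XY - n M_X M_Y}{n\,\sigma_X \sigma_Y}$$

We have

$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - (M_X)^2} = \sqrt{\frac{55}{5} - 9} = \sqrt{2}$$

$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - (M_Y)^2} = \sqrt{\frac{55}{5} - 9} = \sqrt{2}$$

Hence

$$\rho_{XY} = r_{XY}(\text{rank}) = \frac{53 - 5(3)(3)}{5\sqrt{2}\sqrt{2}} = \frac{8}{10} = 0.80$$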


Thus we see that the so-called rank difference method is merely 
the cross-product correlation between the rank numbers of the 
variates. As might be suspected, frequently certain complications 
arise to interrupt the apparently simple ranking of the values. 
Generally there are several scores of the same size, or there exist 
ties in the ranks. In such cases it is customary to give each the 
mean of the ranks of the positions that they occupy. Thus, suppose 
3 tied for fifth place. Had there been no ties, the ranks would have 
been 5, 6, 7. We arbitrarily assign to each place the rank number 6, 
which is the mean of 5, 6, and 7. If 2 scores tied for the eighth 
place, we would assign each the rank number 8.5. 
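The treatment of ties just described can be sketched as a small routine. The function name `average_ranks` and the list of scores are hypothetical; the scores are arranged so that three tie for fifth place and two tie for eighth place, as in the illustration above.

```python
def average_ranks(scores):
    """Rank scores in ascending order; tied scores share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of scores equal to the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2   # mean of the occupied positions
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks

# Hypothetical scores: three tie for 5th place, two tie for 8th place.
scores = [10, 20, 30, 40, 50, 50, 50, 60, 60]
ranks = average_ranks(scores)
print(ranks)
```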

We shall now proceed to develop a formula for finding the rank
coefficient $\rho_{XY}$.





Evidently the X-values are the numbers 1, 2, 3, . . . , n, and the
Y-values are the same numbers but probably arranged in a different
order. Hence

$$\sum X = \sum Y = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$$

$$M_X = M_Y = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$$

Also

$$\sum X^2 = \sum Y^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$$

Hence, using formula (7) page 128,

$$\sigma_X = \sigma_Y = \sqrt{\frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}}$$

From

$$\sum (X - Y)^2 = \sum X^2 - 2\sum XY + \sum Y^2$$

we obtain, substituting values for $\sum X^2$ and $\sum Y^2$ above,

$$\sum XY = \frac{n(n+1)(2n+1)}{6} - \frac{\sum (X - Y)^2}{2}$$

Now, substituting in (7') page 245, the values found, we obtain
after simplifying

$$\rho_{XY} = r_{XY}(\text{rank}) = 1 - \frac{6\sum (X - Y)^2}{n(n^2 - 1)} \tag{20}$$

Thus, after our data are ranked the computation of $\rho_{XY}$ is
decidedly simple. To illustrate the use of formula (20) let us return
to the height-weight data. We have the following table with headings
suitable to the use of formula (20).


Rank in Height and Weight of Five Boys

Boy   Rank in Height   Rank in Weight   X - Y   (X - Y)²
            X                Y
A           1                2            -1        1
B           2                1             1        1
C           3                3             0        0
D           4                5            -1        1
E           5                4             1        1




$$n = 5, \qquad \sum (X - Y)^2 = 4$$

$$\rho_{XY} = 1 - \frac{6(4)}{5(25 - 1)} = 1 - \frac{24}{5(24)} = 1 - \frac{1}{5} = 0.80$$
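The agreement between the rank-difference formula (20) and the cross-product coefficient computed on the ranks can be confirmed with a few lines of Python. This is a sketch with hypothetical function names; the heights and weights are those of Table 60, and the simple ranking helper assumes there are no ties.

```python
from math import sqrt

def ranks_no_ties(values):
    """Ascending ranks 1..n; assumes all values are distinct."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def rho_by_differences(x_ranks, y_ranks):
    """Rank coefficient by formula (20)."""
    n = len(x_ranks)
    d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - 6 * d2 / (n * (n * n - 1))

def pearson(xs, ys):
    """Cross-product correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights = [60, 62, 63, 65, 68]        # inches, Table 60
weights = [137, 132, 148, 157, 153]   # pounds, Table 60

hx, wy = ranks_no_ties(heights), ranks_no_ties(weights)
rho = rho_by_differences(hx, wy)
r_ranks = pearson(hx, wy)
r_raw = pearson(heights, weights)
print(round(rho, 2), round(r_ranks, 2), round(r_raw, 2))
```

The first two values coincide, while the coefficient for the actual measurements is close to but not equal to the rank coefficient.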


EXERCISES 

1. Ten examination papers in algebra were read by two judges and
ranked according to merit. The following table shows the results of the
rankings. Find $\rho_{XY}$.

Examination   Rank by Judge No. 1   Rank by Judge No. 2
Paper                  X                     Y
  1                    6                     5
  2                    2                     1
  3                    4                     6
  4                    8                     9
  5                    1                     2
  6                    3                     3
  7                    7                     7
  8                    5                     4
  9                   10                     8
 10                    9                    10


2. The following table gives the ranks of 10 salesmen by the sales
manager of a corporation and also the ranks of the 10 salesmen on a
psychological test. Find $\rho_{XY}$.

Salesman    Rank by Sales Manager   Rank on Test
Jones                 1                   1
Smith                 2                   3
Brown                 3                   2
Kelly                 4                   6
Sanders               5                   7
Benson                6                   4
Owens                 7                   8
Miller                8                   5
Borden                9                   9
Peterson             10                  10


3. From the following table, by the method of ranks find the correla- 
tion between the grades in Test I and Test II ; between the grades in Test I 
and Test III; between the grades in Test II and Test III. 





Grades of 21 Students in Three Tests in Integral Calculus

Student  Test I  Test II  Test III    Student  Test I  Test II  Test III
   1       80      45        55          12      99      87        99
   2       60      50        80          13      82      70        72
   3       94      81        95          14      98      83        92
   4       93      85        90          15      97      95        93
   5       87      80        70          16      34      55        30
   6       95      90       100          17      96      96        96
   7       74      60        60          18      74      20        40
   8       61      79        85          19      62      72        75
   9       92      82        71          20      63      94        91
  10       67      84        97          21      88      78        94
  11      100      86        98






69. CORRELATION AND CAUSATION 

The correlation coefficient, as we have used the term, is a
mathematical expression which measures the mathematical relationship,
based upon linear regression, or the best-fitting straight line to the
data, that exists between two variables X and Y. It must not
be supposed that a low coefficient of correlation proves a
lack of relationship between the two variables. Consider
the data of Table 61. We note that for these data:

$M_X = 0$ and $\sum XY = 0$

Hence by equation (7):

r = 0


Table 61

          X     Y    XY
          0     5     0
          3     4    12
          4     3    12
          5     0     0
         -3     4   -12
         -4     3   -12
         -5     0     0
Total     0    19     0


That is, based upon the best-fitting straight line the data show a very 
poor relationship or a straight line of very poor fit. 

But based upon the semicircle, $Y = +\sqrt{25 - X^2}$, we have perfect
correlation, since each point is on the curve. This simple illustration
emphasizes a fact that we should keep in mind, namely, that the
Bravais-Pearson cross-product formula is based upon straight-line
regression.
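Both halves of this illustration can be checked in a few lines. The sketch below uses the seven points of Table 61; the variable names are hypothetical. It confirms that every point lies on the semicircle while the cross-product coefficient is zero.

```python
from math import sqrt

# The seven (X, Y) points of Table 61.
points = [(0, 5), (3, 4), (4, 3), (5, 0), (-3, 4), (-4, 3), (-5, 0)]

n = len(points)
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
sum_xx = sum(x * x for x, _ in points)
sum_yy = sum(y * y for _, y in points)
sum_xy = sum(x * y for x, y in points)

# Cross-product (Bravais-Pearson) coefficient.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_xx - sum_x ** 2) * sqrt(n * sum_yy - sum_y ** 2)
)

# Every point satisfies Y = +sqrt(25 - X^2).
on_semicircle = all(y == sqrt(25 - x * x) for x, y in points)

print(r, on_semicircle)
```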





It should also not be supposed that the existence of a high coefficient
of correlation between two variables proves any necessary and inherent
causal relationship between the two, that is, that one is the
absolute cause of the other. Consider the following table:


Table 62 ¹

Year      X      Y        X²        Y²        XY
1870      38     30      1,444       900     1,140
1875      55     38      3,025     1,444     2,090
1880      56     51      3,136     2,601     2,856
1885      73     69      5,329     4,761     5,037
1890      92     97      8,464     9,409     8,924
1895     114    114     12,996    12,996    12,996
1900     138    135     19,044    18,225    18,630
1905     177    169     31,329    28,561    29,913
1910     254    205     64,516    42,025    52,070
Total    997    908    149,283   120,922   133,656


Applying formula (7) we obtain 

r = 0.98 

which is so astoundingly large that we are tempted to believe that 
we have a direct and dependent cause-and-effect relationship. As a 
matter of fact 

X = the total salaries paid school superintendents and teachers in
millions of dollars

and

Y = the total consumption of wines and liquors in the United
States in ten million gallons

for the given years.
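The totals of Table 62 and the resulting coefficient can be recomputed directly from the X and Y columns. This is a minimal sketch with hypothetical variable names; it uses the cross-product formula in its sums-of-raw-values form.

```python
from math import sqrt

# X and Y columns of Table 62 (1870 through 1910, five-year steps).
xs = [38, 55, 56, 73, 92, 114, 138, 177, 254]
ys = [30, 38, 51, 69, 97, 114, 135, 169, 205]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xx = sum(x * x for x in xs)
sum_yy = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_xx - sum_x ** 2) * sqrt(n * sum_yy - sum_y ** 2)
)
print(sum_x, sum_y, sum_xx, sum_yy, sum_xy, round(r, 2))
```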

This illustration shows almost perfect correlation, yet no one 
believes that the consumption of wines and liquors increased neces- 
sarily because teachers’ salaries were increasing, nor that teachers’ 
salaries were increasing necessarily because more wines and liquors 
were being consumed. 

A high coefficient of correlation proves a close linear mathematical 
relationship between the two variables. It proves nothing more. It 

1 The data are from Statistical Abstract of the United States, 1918, pp. 830 
and 835. 





suggests the probability of a cause-and-effect relationship between 
the two variables, but the investigator must search further for the 
explanation. Measurement of correlation is one part of the problem; 
interpretation of the results is a more difficult part of the problem. 1 

Before the subject of statistical analysis had reached its present 
development, John Stuart Mill stated in his Logic : 

Whatever phenomenon varies in any manner whenever another phe- 
nomenon varies in some particular manner, is either a cause or an effect 
of that phenomenon, or is connected with it through some fact of causation . 2 

The suggestion in the last clause of Mill’s statement may assist
us in explaining the above paradoxical relationship between teachers’ 
salaries and the consumption of wines and liquors. The period from 
1870 to 1910 was one of rapid development in the United States. 
Population increased rapidly; foreign and domestic commerce, 
agriculture, and the manufacturing industries grew by leaps and 
bounds. The total amounts paid for the salaries of school superin- 
tendents and school-teachers and the total amount of wines and 
liquors consumed merely kept step with the development in other 
lines. As a matter of fact, we are not at all astonished that the two 
do show a surprisingly large coefficient. We term such correlation
“spurious.”

In the interpretation of the coefficient of correlation it is better 
not to consider it as a measure of causal dependence but rather to 
consider it as a mathematical expression for the degree of association 
between the factors. In this regard Professor Chaddock says, 

Therefore, we no longer search for cause and effect relations as fixed and 
unvarying laws. Association or correlation between occurrences tends to 
replace the older idea of causation in scientific investigation. We have 
seen that variation is a universal characteristic of phenomena. We can 
secure relative likeness in phenomena by a process of classification which 
places similar things together and disregards minor variations. The problem 
of science is to find out how the variation in one group of facts is associated 
with or contingent upon the variation in other groups, and to measure the 
degree of the association. 

The aim is to find the series of facts which are most closely correlated in 
order to enable the investigator to predict future experience. Causation 

1 See Rietz and others, op. cit ., p. 138. 

2 Book III, Chap. VIII, Sect. 6. (Italics my own.)



becomes a descriptive concept reached by statistical processes applied to the facts
of experience.¹

As a final word we wish to reemphasize that the preceding chapter, 
Linear Trends, and this chapter, Simple Correlation, have been 
concerned with the problem of expressing the relationship between 
the sets of data by means of linear regression. We have assumed Y 
to be a linear function of a single independent variable X — or X to 
be a linear function of Y. The close restrictions imposed necessarily 
limit the range of application of the method. As an illustration, in 
considering the problem of July rainfall in Ohio and its effect upon 
the yield of corn in that state — Exercise 3, page 243 — the thought- 
ful student must have wondered about the effect of other natural 
causes, such as the rainfall for May, the rainfall for June, the tempera- 
tures for May, June, July, and August. And well he may wonder. 
The yield of corn may be considered as a function (or effect) of the 
several variables (or causes) mentioned. A study of problems of this 
character in which the dependent variable is a linear function of 
several independent variables belongs to the subject of multiple correla- 
tion , whereas problems in which the dependent variable is a linear 
function of a single independent variable belong to the subject of simple 
correlation. The subject of multiple correlation is treated in Chap- 
ter 9. If the reader desires he may, without loss of continuity, begin 
its study now; or he may defer it. 

Further, we may consider that the relationship between the de- 
pendent variable and the single independent variable can be described 
by some simple curve other than a straight line. Such correlation 
based upon curvilinear regression will be considered in Chapter 10. 

EXERCISES 

1. For the Water Depth-Alfalfa Yield data of Exercise 2, page 243, the 
following is a summary: 

$M_X$ = 33.75 inches      $M_Y$ = 7.25 tons      m = 0.075

$\sigma_X$ = 14.98 inches      $\sigma_Y$ = 1.26 tons      r = 0.89

(1) Find the equation of the regression line of Y on X.

(2) Is the value of r sufficiently large to warrant confidence in the
regression line for purposes of estimation?

(3) Find Y in (1) if X = 40.

(4) Find $S_y$ and interpret your result for the value found in (3).

1 R. E. Chaddock, Principles and Methods in Statistics , 1925, p. 250. 





2. Find the correlation of the yield of a plant of oats with the number of 
kernels per plant for the data of the accompanying table. 

X = the number of kernels per plant. Y = the yield in grams. 


Kernels per Plant 1 



3. The following table is a correlation table for the lengths and the 
breadths of 60 leaves. X = breadths and Y = lengths, in millimeters. 2 


Breadths 



Find r and the regression lines for the data. 

4 . Find r for the Savings Bank Deposits-Strikes and Lockouts data of 
page 236. Is this value of r sufficiently large to warrant your using with 
confidence the regression equations for purposes of estimation? 

5. As in Exercise 4 above, treat the Value of Crops-Value of Land 
(Illinois) data of page 236. 

6. Similarly, treat the Value of Crops-Value of Land (Iowa) data of 
page 237. 

1 The data are from A. S. Gale and C. W. Watkeys, Elementary Functions and
Applications, 1920, p. 432.

2 Gavett, First Course in Statistical Method , p. 234. 










7 . The following correlation table gives the scores of 104 freshmen at 
Georgetown College. X = scores in mathematics. Y = scores in intelli- 
gence. 


Scores in Intelligence and Mathematics Tests of 104 Students 

Mathematics 


\ X 
Y \ 

2.5 

7.5 

12.5 

17.5 

22.5 

27.5 

32.5 

37.5 

42.5 

47.5 

52.5 

57.5 

145 







1 



1 

1 

1 

135 





i 


2 

2 


1 

1 

1 

125 




1 

5 


2 

3 

4 

1 

1 

1 

115 


1 

2 

1 

5 

1 

3 

2 

2 

1 


1 

105 


3 

1 

4 

1 

4 

2 

4 





95 


4 

2 

2 

4 

2 

3 


1 




85 

1 


3 

3 

2 

1 







75 


1 

2 

1 

1 

1 







65 




2 










Find r and the regression lines for the data. 

8. In an investigation of the resemblance of fathers and sons with 
respect to stature, the following summary was obtained: 


Stature of fathers, X          Stature of sons, Y
$M_X$ = 67.7 inches            $M_Y$ = 68.7 inches
$\sigma_X$ = 3.21 inches       $\sigma_Y$ = 2.71 inches

r = 0.51

What is the most probable height of the sons of a group of selected
fathers whose mean height is 6 feet? Discuss the reliability of this estimate
by means of $S_y$.


9. Are the following correlations positive or negative? 

(1) The speed of an auto and the distance required to bring the car 
to rest when the brakes are applied. 

(2) Age of applicants for life insurance and cost of insurance. 

(3) Age of an automobile and its trade-in value. 

(4) Family income and cost of the family car. 

(5) Marriage rate and index of unemployment. 

(6) Age and blood pressure. 

(7) Age of husbands and age of wives. 

(8) Index of unemployment and amount of goods purchased. 

(9) The soot content in the air at Pittsburgh and the production of 
pig iron. 






(10) Total production of wheat and the average farm price per bushel.

(11) Per cent illiteracy and the per cent foreign population in the 
counties in Pennsylvania. 

(12) Crime, as measured by the number of indictable offences tried, 
and the index of unemployment. 

(13) Value of crops per acre and the value of land per acre in Illinois. 

(14) Amount of savings deposits and the number of strikes and lock-outs 
in the United States. 

(15) Number of hogs slaughtered per month at Chicago and monthly 
price of pork at Chicago. 

(16) Marriage rate and the index of industrial activity. (See Groves 
and Ogburn: American Marriage and Family Relationships , Chap- 
ter XVIII.) 

(17) Scholarship and success in life. (See Gifford: “ Does Business 
Want Scholars? ” Harpers , May, 1928.) 

10. In the following table (F. C. Mills: Statistical Methods, p. 381)

X = Federal Reserve Banks’ Discount Rates (per cent). 

Y = Commercial Banks’ Discount Rates (per cent). 


Federal Reserve Banks’ Discount Rates (per cent) 




(1) Choose (h, k) at (5.50, 6.50), compute r and the regression of Y on X.

(2) Find the estimated value of Y if X = 5.00.

(3) Compute the arithmetic mean of the Y-array for X = 5.00 and
compare with the value found in (2).

(4) Find the regression equation of X on Y.

(5) Find the estimated value of X if Y = 7.00.

(6) Find the arithmetic mean of the X-array for Y = 7.00 and compare
with the value found in (5).




(7) Find $S_y$ and $S_x$ for the estimated values in (2) and (5) and interpret
them.

11. The data in the following table were taken from the Handbook of
Labor Statistics, 1936 Edition, pages 132 and 673.

X = Index of Wholesale Prices in the United States. (U.S. Dept. of
Labor. Monthly Average, 1926 = 100.)

Y = General Index of Employment. (U.S. Dept. of Labor. 3-year
average, 1923-1925 = 100.)

Compute r. 


Year     X     Y      Year     X     Y      Year     X     Y
1919    139   107     1925    104    99     1931     73    77
1920    154   108     1926    100   101     1932     65    64
1921     98    82     1927     95    99     1933     66    69
1922     97    91     1928     97    99     1934     75    79
1923    101   104     1929     95   105     1935     80    82
1924     98    97     1930     86    92





12. The following data are taken from the Yearbook of Agriculture , 
1935, pp. 363-364. 

X = supply of wheat in the U.S., July 1. 

Y = price of wheat at Chicago.



        Supply         Price              Supply         Price
Year    (million bu.)  (cents)    Year    (million bu.)  (cents)
        X              Y                  X              Y
1919     77            227        1927    122            138
1920    145            216        1928    124            117
1921    126            128        1929    247            130
1922    114            113        1930    303             84
1923    137            106        1931    326             53
1924    144            139        1932    385             53
1925    115            161        1933    393             94
1926    105            140





(1) Draw a chart for these data similar to Chart 6, p. 48. 

(2) Compute r and interpret it. 

(3) Compute m and interpret it. 

(4) Write the equation of the regression line of Y on X . 

(5) Find the estimated values of Y if X = 100, 200, and 300. 

(6) Find $S_y$ of the estimates, and interpret.

13. The following table gives the average number of kernels per culm 
per oat plant and the average height of the oat plants (Love-Leighty). 
Find r. 



                         Number of Kernels (x)

   y      35   45   55   65   75   85   95  105  115  125   f(y)
 87.5      -    -    -    -    -    -    -    3    2    2      7
 82.5      -    -    -    -    1   12   26   23    9    2     73
 77.5      -    -    -    2   16   40   38   23    3    -    122
 72.5      -    -    1   13   30   59   32    5    -    -    140
 67.5      -    -    7   22    9    6    1    -    -    -     45
 62.5      -    4    7    -    -    -    -    -    -    -     11
 57.5      1    -    1    -    -    -    -    -    -    -      2
 f(x)      1    4   16   37   56  117   97   54   14    4    400


14. In the following table: 

X = price per bushel in cents received by producers December 1 for 
corn 

Y = price per bushel in cents received by producers December 1 for 
wheat 


Find r and discuss its significance. Would you say this correlation is 
spurious? 


Price of Corn and Price of Wheat in the United States , 1 1909-1928 


Year      X       Y       Year      X       Y
1909     58.6    98.4     1919    134.5   214.9
1910     48.0    88.3     1920     67.0   143.7
1911     61.8    87.4     1921     42.3    92.6
1912     48.7    76.0     1922     65.8   100.7
1913     69.1    79.9     1923     72.6    92.3
1914     64.4    98.6     1924     98.2   129.9
1915     57.5    91.9     1925     67.4   141.6
1916     88.9   160.3     1926     64.2   119.8
1917    127.9   200.8     1927     72.3   111.5
1918    136.5   204.2     1928     75.1    97.2


15. If X = Income in dollars per capita in Texas in 1932,

Y = Retail sales in dollars per capita in Texas in 1932,
r = 0.875, and m = 0.746,

(1) Comment on the estimative value of the line of regression 
Y = 0.746X + 8.33. 

1 The data are from Yearbook of Agriculture , 1928, pp. 670 and 702. 






(2) If X = $175, compute Y, and compare with the observed value, 
$133. 

(3) If X increases $1.00, what is the expected change in Y? 

16. In the following table:

X = scores of 32 students on the Bucknell test in intermediate algebra.
Y = scores of the same students on a standardized test in intermediate 
algebra. 

Z = the semester grades of the same students in intermediate algebra. 


 X   Y   Z      X   Y   Z      X   Y   Z      X   Y   Z
54  56  67     90  94  91     27  46  35     88  95  90
55  64  67     63  79  77     78  54  76     72  70  82
64  67  74     43  56  60     10  19  20     55  59  61
33  43  48     69  48  70     49  39  60     61  68  76
57  55  60     47  48  52     46  58  50     33  52  50
42  59  60     62  59  67     70  41  62     65  45  65
88  84  81     92  64  90     45  37  50     84  82  88
85  84  86     75  68  85     95  99  92     55  60  52


Verify the following analysis:

$M_X = 61$        $M_Y = 61$        $M_Z = 67$

$\sigma_X = 20.4$   $\sigma_Y = 17.9$   $\sigma_Z = 17.0$

$r_{XY} = 0.78$   $r_{XZ} = 0.94$   $r_{YZ} = 0.84$

Which test was given the greater weight in the determination of the
students' semester grades?

17. The following data are taken from the 1935 World Almanac, pp. 479,
499.

Column II gives the average attendance (in thousands) in New York 
City schools for the given years. 

Column III gives the number (in thousands) arraigned before the 
Magistrates Courts in New York City in the same years. 

Find $\rho_{XY}$ or $r_{XY}$ (rank) for these data. Would you say that this
correlation is spurious? Explain.


Year   Col. II   Col. III     Year   Col. II   Col. III
1918     700       202        1923     853       420
1919     712       282        1924     870       455
1920     736       355        1925     891       440
1921     779       367        1926     910       437
1922     814       434        1927     926       527



Chapter 9 

MULTIPLE CORRELATION 

70. PRELIMINARY EXPLANATION 

Our previous work in correlation has been concerned with problems 
involving only two variables, an independent variable X and a de- 
pendent variable Y. Such correlation is called “bivariate.” It is 
obvious that many types of phenomena are affected by more than 
one factor and that the variations in the dependent variable may 
be due to the interaction of many forces. 

In bivariate correlation we measure the relationship between the 
dependent variable Y and a single independent variable X , com- 
pletely ignoring the influence upon Y of other forces that may be 
just as potent as X. Thus, on page 243 we measured the influence 
of July rainfall X upon the production of corn in Ohio Y. We found r 
to be 0.61, which shows that July rainfall does exert a significant
influence upon the production of corn. But we may wonder if it 
exerts a greater influence than June rainfall or June temperature or 
July temperature. We are thus aware that the production of corn 
may be dependent upon several variables, and a consideration of the 
production in this regard would present a problem in multiple correla- 
tion. Multiple correlation is then concerned with the combined influence 
of several independent variables upon a single dependent variable. 

As another illustration, suppose we have the scores made by a 
group of students on objective tests in English, Mathematics, and 
Intelligence. By means of simple correlation we can measure the 
relationship between the scores in Intelligence and those in Mathe- 
matics, between the scores in Intelligence and those in English, and 
between the scores in English and the scores in Mathematics. What 
we now need is a method of combining two factors, say English and 
Mathematics, in order that an estimate may be made of their in- 
fluence in combination upon the third factor, Intelligence. 

The method of procedure by which this may be accomplished is 
similar to that used in simple correlation. 



71. THE CASE OF THREE VARIABLES 

Let us assume that $X_1$, $X_2$, $X_3$ are three variable quantities which
represent three interacting forces. Any one variable may be considered
mathematically a function of the other two. As in the case
of bivariate correlation, we shall assume that the relationships are
linear, that is, that the N observed points representing the N observed
sets of data are distributed about the plane

$$X_1 = b_{12}X_2 + b_{13}X_3 + c \tag{1}$$

in which $X_2$ and $X_3$ are independent variables and $X_1$ is the
dependent variable.¹

We shall determine the constants in accordance with the Principle
of Least Squares: The plane best fitting a set of points is that one
in which the constants are so determined that the sum of the squares
of the $X_1$-residuals is a minimum.

An $X_1$-residual is defined by the equation

$$\rho = X_1 - (b_{12}X_2 + b_{13}X_3 + c) \tag{2}$$

We shall determine $b_{12}$, $b_{13}$, and $c$ so that

$$\sum \rho^2 = \sum \left[X_1 - (b_{12}X_2 + b_{13}X_3 + c)\right]^2 \tag{3}$$

shall be a minimum. The conditions for this are that the first partial
derivatives of $\sum \rho^2$ with respect to $c$, $b_{12}$, and $b_{13}$ shall be equal to zero.
Equating to zero these derivatives, we obtain the normal equations

$$\begin{aligned}
b_{12}\sum X_2 + b_{13}\sum X_3 + Nc &= \sum X_1 \\
b_{12}\sum X_2^2 + b_{13}\sum X_2X_3 + c\sum X_2 &= \sum X_1X_2 \\
b_{12}\sum X_2X_3 + b_{13}\sum X_3^2 + c\sum X_3 &= \sum X_1X_3
\end{aligned} \tag{4}$$

from which, by simultaneous solution, the values of $b_{12}$, $b_{13}$, and $c$ may
be determined in terms of the observed values $X_1$, $X_2$, $X_3$.

Thus, suppose we wish to find the plane

$$X_1 = b_{12}X_2 + b_{13}X_3 + c$$

that best fits the ten points $(X_1, X_2, X_3)$ given in Table 63.

¹ The first subscript affixed to the regression coefficient $b_{12}$ will be the
subscript of the letter X on the left (the dependent variable), and the second will
be the subscript of the X to which it is attached.



Table 63

 X_3   X_2   X_1   X_3^2   X_2^2   X_1^2    X_1X_2   X_1X_3   X_2X_3
   2     2    11       4       4      121       22       22        4
   3     4    17       9      16      289       68       51       12
   4     6    26      16      36      676      156      104       24
   5     5    28      25      25      784      140      140       25
   6     8    31      36      64      961      248      186       48
   7     7    35      49      49    1,225      245      245       49
   9    10    41      81     100    1,681      410      369       90
  10    11    49     100     121    2,401      539      490      110
  11    13    63     121     169    3,969      819      693      143
  13    14    69     169     196    4,761      966      897      182
  70    80   370     610     780   16,868    3,613    3,197      687

M_3 = 7   M_2 = 8   M_1 = 37


We complete the table to find the $\sum$ functions that we need in
the normal equations (4). Substituting in (4) we obtain

$$\begin{aligned}
80b_{12} + 70b_{13} + 10c &= 370 \\
780b_{12} + 687b_{13} + 80c &= 3613 \\
687b_{12} + 610b_{13} + 70c &= 3197
\end{aligned}$$

To solve these equations we divide each equation by the coefficient
of $b_{12}$ of that equation. We obtain

$$\begin{aligned}
b_{12} + .875b_{13} + .125c &= 4.625 \\
b_{12} + .881b_{13} + .103c &= 4.632 \\
b_{12} + .888b_{13} + .102c &= 4.654
\end{aligned}$$

Next we subtract the first equation from the second and the
second equation from the third. We obtain

$$\begin{aligned}
.006b_{13} - .022c &= .007 \\
.007b_{13} - .001c &= .022
\end{aligned}$$

or, multiplying by 1,000

$$\begin{aligned}
6b_{13} - 22c &= 7 \\
7b_{13} - c &= 22
\end{aligned}$$

Solving these equations and substituting we obtain $b_{12} = 1.735$,
$b_{13} = 3.223$, $c = 0.561$. The equation of the best fitting plane is

$$X_1 = 1.735X_2 + 3.223X_3 + 0.561$$

Exercise. Show that the point $(M_1, M_2, M_3) = (37, 8, 7)$ is on this plane.
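The normal equations (4) for the data of Table 63 can also be solved numerically. The following is a minimal Python sketch, not part of the original text; the helper names `solve_linear` and `fit_plane` are hypothetical. Solved without intermediate rounding, the same three equations appear to give values near 1.894, 3.054, and 0.471, which differ from the three-decimal hand solution because the reduced equations have nearly equal coefficients, so small rounding errors are magnified. Both planes pass through the point of means.

```python
# Table 63, one tuple per observation: (X1, X2, X3).
DATA = [
    (11, 2, 2), (17, 4, 3), (26, 6, 4), (28, 5, 5), (31, 8, 6),
    (35, 7, 7), (41, 10, 9), (49, 11, 10), (63, 13, 11), (69, 14, 13),
]


def solve_linear(a, d):
    """Solve the square system a * u = d by Gaussian elimination with partial pivoting."""
    n = len(d)
    m = [list(row) + [rhs] for row, rhs in zip(a, d)]  # augmented matrix
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    u = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * u[k] for k in range(i + 1, n))
        u[i] = (m[i][n] - tail) / m[i][i]
    return u


def fit_plane(rows):
    """Return (b12, b13, c) by forming and solving the normal equations (4)."""
    n = len(rows)
    s1 = sum(x1 for x1, _, _ in rows)
    s2 = sum(x2 for _, x2, _ in rows)
    s3 = sum(x3 for _, _, x3 in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    s33 = sum(x3 * x3 for _, _, x3 in rows)
    s23 = sum(x2 * x3 for _, x2, x3 in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s13 = sum(x1 * x3 for x1, _, x3 in rows)
    coefficients = [[s2, s3, n], [s22, s23, s2], [s23, s33, s3]]
    constants = [s1, s12, s13]
    return tuple(solve_linear(coefficients, constants))


b12, b13, c = fit_plane(DATA)
print(round(b12, 3), round(b13, 3), round(c, 3))
```

Elimination with pivoting is used here only for brevity; any method of simultaneous solution serves equally well.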



We may test the goodness-of-fit of the plane by finding the computed
values of $X_1$ for the given values of $X_2$ and $X_3$, and the
$X_1$-residuals. The computed values of $X_1$, the ($X_1$-residuals), and the
($X_1$-residuals)² are shown in Table 64.


Table 64

 X_3   X_2   X_1   Computed X_1   X_1-residuals   (X_1-residuals)^2
                                       ρ                 ρ^2
   2     2    11      10.477        + 0.523              .274
   3     4    17      17.170        - 0.170              .029
   4     6    26      23.863        + 2.137             4.567
   5     5    28      25.351        + 2.649             7.017
   6     8    31      33.779        - 2.779             7.723
   7     7    35      35.267        - 0.267              .071
   9    10    41      46.918        - 5.918            35.023
  10    11    49      51.876        - 2.876             8.271
  11    13    63      58.569        + 4.431            19.634
  13    14    69      66.750        + 2.250             5.062
                                    - 0.020            87.671 = Σρ^2


We note that five points are above the plane and five points are
below it, and that the sum of the residuals is essentially zero. The
sum of the squares of the $X_1$-residuals, $\sum\rho^2$, plays a role in multiple
correlation similar to that played by $\sum\rho^2$ in simple correlation.
(See p. 233.) It assists us in finding the standard error of estimate,
$S_{1(23)}$. As we did in simple correlation, we define the standard error
of estimate by the equation¹

$$S_{1(23)} = \sqrt{\frac{\sum\rho^2}{N}}$$

This is a quantity which, when combined with the computed value of
$X_1$, makes possible our measuring the confidence or the reliability
we may place in values of $X_1$ estimated from the equation for given
values of $X_2$ and $X_3$. Thus, the odds are 2 to 1 that, for given
values of $X_2$ and $X_3$, the observed $X_1$ will lie within the interval

$$(\text{computed } X_1) \pm S_{1(23)}$$

Similarly, the odds are 19 to 1 that the observed $X_1$ will lie within

$$(\text{computed } X_1) \pm 2S_{1(23)}$$

and 385 to 1 that the observed $X_1$ will lie within

$$(\text{computed } X_1) \pm 3S_{1(23)}$$

¹ The subscript before the parenthesis designates the variable estimated (the
dependent variable) and the subscripts within the parentheses designate the
variables from which the estimate has been made.

For the problem we are considering

$$S_{1(23)} = \sqrt{\frac{87.671}{10}} = 2.961$$

It will be noted that only two of the ten points have residuals
numerically larger than 2.961, and that no point has a residual
numerically larger than 2(2.961).

In a later section we will discuss the coefficient of multiple correlation,
which is an expression that measures the degree of the relation
between a single dependent variable, say $X_1$, and several independent
variables, $X_2$ and $X_3$, in combination. We shall show that
this coefficient $R_{1(23)}$ may be found from the formula

$$R_{1(23)} = \sqrt{1 - \frac{S_{1(23)}^2}{\sigma_1^2}}$$

where $\sigma_1$ means $\sigma_{X_1}$.

From Table 63 we find

$$\sigma_1 = \sqrt{\frac{\sum X_1^2}{N} - M_1^2} = \sqrt{\frac{16{,}868}{10} - 37^2} = 17.83$$

Hence

$$R_{1(23)} = \sqrt{1 - \frac{S_{1(23)}^2}{\sigma_1^2}} = \sqrt{1 - .0275} = \sqrt{.9725} = 0.986$$

Thus, we have completed the analysis of the data of Table 63.
This analysis has included finding (1) the best fitting plane, (2) the
standard error of estimate, and (3) the coefficient of multiple
correlation between $X_1$ and ($X_2$ and $X_3$) in combination.
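The three steps of this analysis can be checked numerically. The sketch below is an illustration rather than part of the original text: it takes the plane coefficients quoted above as given, recomputes the residuals of Table 64, and then applies the definitions $S_{1(23)} = \sqrt{\sum\rho^2/N}$ and $R_{1(23)} = \sqrt{1 - S_{1(23)}^2/\sigma_1^2}$. The function name `goodness_of_fit` is hypothetical.

```python
import math

# Table 63, one tuple per observation: (X1, X2, X3).
DATA = [
    (11, 2, 2), (17, 4, 3), (26, 6, 4), (28, 5, 5), (31, 8, 6),
    (35, 7, 7), (41, 10, 9), (49, 11, 10), (63, 13, 11), (69, 14, 13),
]

# Plane coefficients as quoted in the text (three-decimal hand solution).
B12, B13, C = 1.735, 3.223, 0.561


def goodness_of_fit(rows, b12, b13, c):
    """Return (sum of squared residuals, standard error S, multiple correlation R)."""
    n = len(rows)
    residuals = [x1 - (b12 * x2 + b13 * x3 + c) for x1, x2, x3 in rows]
    sum_sq = sum(p * p for p in residuals)
    s = math.sqrt(sum_sq / n)                       # S = sqrt(sum(rho^2) / N)
    mean1 = sum(x1 for x1, _, _ in rows) / n
    var1 = sum(x1 * x1 for x1, _, _ in rows) / n - mean1 ** 2  # sigma_1 squared
    r = math.sqrt(1.0 - s * s / var1)               # R = sqrt(1 - S^2 / sigma_1^2)
    return sum_sq, s, r


sum_sq, s, r = goodness_of_fit(DATA, B12, B13, C)
print(round(sum_sq, 3), round(s, 3), round(r, 3))
```

Dividing by N rather than N - 1 follows the definition of the standard error of estimate used in this chapter.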

EXERCISES 

1. For the values of $b_{12}$, $b_{13}$, and $c$ determined by (4), show that

(a) the algebraical sum of the $X_1$-residuals is equal to zero, and that

(b) the point $(M_1, M_2, M_3)$ is on (1).

Note. The quantities $M_1$, $M_2$, and $M_3$ are the means of the variables
$X_1$, $X_2$, and $X_3$ respectively.


2. For the data in the following table:

 X_1   X_2   X_3
  10     2     5
  15     4     7
  17     6     8
  19     8     9
  25     9    12
  22    10    10
  26    11    13
  31    12    15
  30    13    14
  35    15    17

(1) Find the regression equation for these data
with $X_1$ as dependent upon $X_2$ and $X_3$.

(2) Find the computed values of $X_1$ for the given
values of $X_2$ and $X_3$.

(3) Find the $X_1$-residuals.

(4) Find $S_{1(23)}$ and $R_{1(23)}$.

(5) How many of the points are within ($X_1$ computed) $\pm\ S_{1(23)}$?


72. THE CASE OF THREE VARIABLES CONTINUED 
Secondary Explanation 


The method employed in the preceding section is satisfactory when
the number, N, of sets of values is small, say less than forty. When
N is large, as it usually is, we need a more systematic procedure.
Further, the development of a theory in terms of the original variates,
$X_1$, $X_2$, and $X_3$ is rather complex and tedious.

A simpler and more elegant procedure is to show that the centroidal
point $(M_1, M_2, M_3)$ is on the best-fitting plane, then to transform our
variates to this centroidal point as origin. (We shall indicate the
means of the variables $X_1$, $X_2$, and $X_3$ by $M_1$, $M_2$, and $M_3$ respectively,
and their standard deviations by $\sigma_1$, $\sigma_2$, and $\sigma_3$.) We shall prove
that $(M_1, M_2, M_3)$ satisfies equation (1) for the values of $b_{12}$, $b_{13}$,
and $c$ determined by equations (4).

If the first of equations (4) be divided by N we have

$$b_{12}\frac{\sum X_2}{N} + b_{13}\frac{\sum X_3}{N} + c = \frac{\sum X_1}{N}$$

or

$$b_{12}M_2 + b_{13}M_3 + c - M_1 = 0$$

which is the condition that $(M_1, M_2, M_3)$ is on (1).

We now translate our data to the centroidal point as origin and
take the equation of the plane through this point to be

$$x_1 = b_{12}x_2 + b_{13}x_3$$

where

$$x_1 = X_1 - M_1, \qquad x_2 = X_2 - M_2, \qquad x_3 = X_3 - M_3$$

For this form of the regression plane any $x_1$-residual is given by

$$\rho = x_1 - (b_{12}x_2 + b_{13}x_3)$$

and by equating to zero the first partial derivatives of

$$\sum\rho^2 = \sum\left[x_1 - (b_{12}x_2 + b_{13}x_3)\right]^2 \tag{5}$$

with respect to $b_{12}$ and $b_{13}$, we obtain the normal equations

$$\begin{aligned}
b_{12}\sum x_2^2 + b_{13}\sum x_2x_3 &= \sum x_1x_2 \\
b_{12}\sum x_2x_3 + b_{13}\sum x_3^2 &= \sum x_1x_3
\end{aligned} \tag{6}$$

Let $\sigma_j$ be the standard deviation of the N values of $X_j$, and let
$r_{pq}$ be the correlation coefficient of the N given pairs of values of
$X_p$ and $X_q$. Thus $\sum x_2^2 = N\sigma_2^2$, $\sum x_3^2 = N\sigma_3^2$, $\sum x_1x_2 = N\sigma_1\sigma_2r_{12}$,
$\sum x_1x_3 = N\sigma_1\sigma_3r_{13}$, $\sum x_2x_3 = N\sigma_2\sigma_3r_{23}$.

By expressing the summations in terms of the standard deviations
and correlation coefficients, the normal equations (6) after
simplification become

$$\begin{aligned}
b_{12}\sigma_2 + b_{13}\sigma_3r_{23} &= \sigma_1r_{12} \\
b_{12}\sigma_2r_{23} + b_{13}\sigma_3 &= \sigma_1r_{13}
\end{aligned} \tag{7}$$

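Solving the normal equations (7) we obtain the regression
coefficients

$$b_{12} = \frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}\cdot\frac{\sigma_1}{\sigma_2}, \qquad
b_{13} = \frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}\cdot\frac{\sigma_1}{\sigma_3} \tag{8}$$

and the regression plane is thus

$$\frac{x_1}{\sigma_1}(1 - r_{23}^2) = \frac{x_2}{\sigma_2}(r_{12} - r_{13}r_{23}) + \frac{x_3}{\sigma_3}(r_{13} - r_{12}r_{23}) \tag{9}$$

In terms of the original variates $X_1$, $X_2$, $X_3$ the equation of the
regression plane is

$$\frac{X_1 - M_1}{\sigma_1}(1 - r_{23}^2) = \frac{X_2 - M_2}{\sigma_2}(r_{12} - r_{13}r_{23}) + \frac{X_3 - M_3}{\sigma_3}(r_{13} - r_{12}r_{23}) \tag{10}$$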



Equation (10) gives the most probable value of $X_1$ for assigned
values of $X_2$ and $X_3$. Analogous equations may be written with $X_2$
and $X_3$ as the dependent variables by cyclically
permuting the subscripts 1, 2, and 3; that is, replacing
1 by 2, 2 by 3, and 3 by 1, as if one were
going around the circle in the direction indicated
by the figure.

We have thus reached a result which gives an effective summary
of the manner in which $X_2$ and $X_3$ in combination affect $X_1$. Further,
it is delightful to observe that this summarizing equation involves
nothing more complicated than simple correlation coefficients.
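As an illustration of equations (8), the following Python sketch (not from the original text) computes the means, standard deviations, and simple correlation coefficients of the Table 63 data and from them the regression coefficients. The function names are hypothetical, and standard deviations are taken with divisor N, as in this chapter. The result should agree with an exact solution of the normal equations (4), since (8) is algebraically equivalent to them.

```python
import math

# Columns of Table 63.
X1 = [11, 17, 26, 28, 31, 35, 41, 49, 63, 69]
X2 = [2, 4, 6, 5, 8, 7, 10, 11, 13, 14]
X3 = [2, 3, 4, 5, 6, 7, 9, 10, 11, 13]


def mean(v):
    return sum(v) / len(v)


def std(v):
    """Standard deviation with divisor N."""
    m = mean(v)
    return math.sqrt(sum((x - m) ** 2 for x in v) / len(v))


def corr(u, v):
    """Simple (product-moment) correlation coefficient of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    return cov / (std(u) * std(v))


def regression_plane(x1, x2, x3):
    """Return (b12, b13, c) from equations (8); c makes the plane pass through the means."""
    s1, s2, s3 = std(x1), std(x2), std(x3)
    r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
    b12 = (r12 - r13 * r23) / (1 - r23 ** 2) * s1 / s2
    b13 = (r13 - r12 * r23) / (1 - r23 ** 2) * s1 / s3
    c = mean(x1) - b12 * mean(x2) - b13 * mean(x3)
    return b12, b13, c


b12, b13, c = regression_plane(X1, X2, X3)
print(round(b12, 3), round(b13, 3), round(c, 3))
```

When $r_{23}$ is close to 1, as it is for these data, the divisor $1 - r_{23}^2$ is small and the coefficients become sensitive to rounding in the correlation coefficients.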



EXERCISES

1. For the data in the following table:

 X_1   X_2   X_3
   2    26     1
   4    20     2
   6    20     3
   9    17     4
   5     7     5
   5     5     6
  11     3     7

(1) Verify the following:

$M_1 = 6$          $M_2 = 14$         $M_3 = 4$

$\sigma_1 = 2.828$   $\sigma_2 = 8.246$   $\sigma_3 = 2$

$r_{12} = -0.551$   $r_{13} = 0.707$   $r_{23} = -0.970$

(2) Find the regression plane with $X_1$ as dependent
on $X_2$ and $X_3$.

(3) Find $R_{1(23)}$ and $S_{1(23)}$.


2. For the data in the following table:

 X_1   X_2   X_3
   5     4     5
   4     5     2
   5     6     4
   6     4     9
   9     5     8
  10     6     4
   9     6    10
  12     7    11
  11     9    10
   9     8     7

(1) Verify the following:

$M_1 = 8$          $M_2 = 6$          $M_3 = 7$

$\sigma_1 = 2.646$   $\sigma_2 = 1.549$   $\sigma_3 = 2.933$

$r_{12} = .683$    $r_{13} = .696$    $r_{23} = .374$

(2) Find the regression plane with $X_1$ as dependent
on $X_2$ and $X_3$.

(3) Find the computed values of $X_1$ and the $X_1$-residuals.

(4) Find $R_{1(23)}$ and $S_{1(23)}$.

(5) How many of the points are within ($X_1$ computed) $\pm\ S_{1(23)}$?


3. In the following table

$X_1$ = the semester grades of 32 students in intermediate algebra
$X_2$ = the scores of the same students on a standardized test in
intermediate algebra

$X_3$ = the scores of the same students on the Bucknell test in intermediate
algebra



X_3 X_2 X_1    X_3 X_2 X_1    X_3 X_2 X_1    X_3 X_2 X_1
 54  56  67     90  94  91     27  46  35     88  95  90
 55  64  67     63  79  77     78  54  76     72  70  82
 64  67  74     43  56  60     10  19  20     55  59  61
 33  43  48     69  48  70     49  39  60     61  68  76
 57  55  60     47  48  52     46  58  50     33  52  50
 42  59  60     62  59  67     70  41  62     65  45  65
 88  84  81     92  64  90     45  37  50     84  82  88
 85  84  86     75  68  85     95  99  92     55  60  52

(1) Verify the following values:

$M_1 = 67$         $M_2 = 61$         $M_3 = 61$

$\sigma_1 = 17.0$    $\sigma_2 = 17.9$    $\sigma_3 = 20.4$

$r_{12} = 0.84$    $r_{13} = 0.94$    $r_{23} = 0.78$

(2) Find $R_{1(23)}$ and $S_{1(23)}$.

(3) Find the equation of the regression plane with $X_1$ dependent upon
$X_2$ and $X_3$. Show that the point $(M_1, M_2, M_3)$ is on this plane.

(4) What meaning do you attach to the values of $b_{12}$ and $b_{13}$?

(5) Estimate $X_1$ for $X_2 = 84$ and $X_3 = 81$. Use your value of $S_{1(23)}$
to interpret this estimate.

4. The following table gives a summary of the fundamental statistical
constants that were obtained from scores made on objective tests in 
English, Mathematics, and Intelligence by 343 Bucknell freshmen. 

(1) Find the equation for the regression of Intelligence on English and 
Mathematics. 

(2) What is the estimated Intelligence score for an individual whose 
English score was 172 and whose Mathematics score was 40? 

(3) What is the estimated Mathematics score of an individual whose 
English score was 160 and whose Intelligence score was 150? 


Fundamental Constants from 343 Scores in the Tests Given
in English, Mathematics, and Intelligence

                               English   Mathematics   Intelligence
Correlations   English           1.00       0.30           0.65
               Mathematics       0.30       1.00           0.46
               Intelligence      0.65       0.46           1.00
Arithmetic Means                  151         34            140
Standard Deviations                44         12             45


73. COEFFICIENT OF MULTIPLE CORRELATION
Three Variables

It is evident that the value of equation (9) or (10) as a tool for
purposes of estimation depends upon the closeness of fit of the
plane to the points. As was suggested in the preceding section we
shall use for measuring the goodness of fit of the plane to the points
the standard error of estimate, $S_{1(23)}$,

$$S_{1(23)} = \sqrt{\frac{\sum\rho^2}{N}}$$

where $\sum\rho^2$ is determined from the values of $b_{12}$ and $b_{13}$ in (7) or (8).
From (5)

$$\begin{aligned}
\sum\rho^2 &= \sum\left[x_1 - (b_{12}x_2 + b_{13}x_3)\right]^2 \\
&= \sum x_1^2 + b_{12}^2\sum x_2^2 + b_{13}^2\sum x_3^2 - 2b_{12}\sum x_1x_2 - 2b_{13}\sum x_1x_3 + 2b_{12}b_{13}\sum x_2x_3
\end{aligned}$$

which may be written in the form

$$\sum\rho^2 = N\left[\sigma_1^2 + b_{12}^2\sigma_2^2 + b_{13}^2\sigma_3^2 - 2b_{12}\sigma_1\sigma_2r_{12} - 2b_{13}\sigma_1\sigma_3r_{13} + 2b_{12}b_{13}\sigma_2\sigma_3r_{23}\right] \tag{11}$$

We desire the value of $\sum\rho^2$ for the values of $b_{12}$ and $b_{13}$ given by (7)
or (8). This may be easily found by multiplying the normal equations
(7) by $b_{12}\sigma_2$ and $b_{13}\sigma_3$ respectively, adding the results, and
substituting the results in (11). The value for $\sum\rho^2$ then becomes

$$\sum\rho^2 = N\left[\sigma_1^2 - b_{12}\sigma_1\sigma_2r_{12} - b_{13}\sigma_1\sigma_3r_{13}\right] \tag{12}$$

If now the values of $b_{12}$ and $b_{13}$ given by (8) are substituted, we have

$$S_{1(23)}^2 = \sigma_1^2\left[1 - \frac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}\right] \tag{13}$$

or

$$S_{1(23)} = \sigma_1\sqrt{1 - R_{1(23)}^2} \tag{14}$$

where

$$R_{1(23)} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}} \tag{15}$$

is the “coefficient of multiple correlation” of $X_1$ on $X_2$ and $X_3$.

By permuting the subscripts we may write down the values of
$R_{2(13)}$ and $R_{3(12)}$. Due to the fact that we have no mathematical
method of attaching a meaning to the algebraical sign of $R_{1(23)}$, it
is customary to write it without sign.
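Formulas (14) and (15) translate directly into a short computation. The sketch below is illustrative only: the function names are hypothetical and the sample inputs are made-up values chosen so that the answers are easy to check by hand (with $r_{23} = 0$, formula (15) reduces to $\sqrt{r_{12}^2 + r_{13}^2}$). The small clamp in `standard_error` is an added safeguard against floating-point round-off when $R_{1(23)}$ is essentially 1.

```python
import math


def multiple_r(r12, r13, r23):
    """Coefficient of multiple correlation R_1(23) from formula (15)."""
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)


def standard_error(sigma1, r12, r13, r23):
    """Standard error of estimate S_1(23) from formula (14)."""
    r = multiple_r(r12, r13, r23)
    # Clamp at zero so round-off cannot produce the square root of a negative number.
    return sigma1 * math.sqrt(max(0.0, 1.0 - r * r))


# Hypothetical inputs, chosen only for easy hand checking.
print(multiple_r(0.3, 0.4, 0.0))
print(standard_error(10.0, 0.3, 0.4, 0.0))
```

The three correlation coefficients must come from the same set of observations; arbitrary triples can make the quantity under the radical in (15) exceed 1.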



From (13) we may note that since $S_{1(23)}^2$ is a non-negative quantity,
$0 \le R_{1(23)}^2 \le 1$. When $R_{1(23)}$ is equal to unity numerically, that is,
when

$$r_{12}^2 + r_{13}^2 + r_{23}^2 - 2r_{12}r_{13}r_{23} = 1$$

we have perfect multiple correlation. In this case the points are
on the regression plane.

The coefficient of multiple correlation is an expression which
measures the degree of relationship between a single dependent
variable and a number of independent variables in combination.
It is more accurately defined as the ordinary cross-product coefficient
of correlation between the $X_1$ estimated by (10) and the observed
$X_1$, or between $x_1$ estimated by (9) and the observed $x_1$. (See
Exercise 1 of the next list of exercises.)


EXERCISES 

1. (a) The estimated value of $x_1$, say $x_{1e}$, may be found from
$x_{1e} = b_{12}x_2 + b_{13}x_3$ where $b_{12}$ and $b_{13}$ are given by (7) or (8). Show that
the standard deviation $\sigma_{1e}$ of $x_{1e}$ is given by
$\sigma_{1e}^2 = b_{12}\sigma_1\sigma_2r_{12} + b_{13}\sigma_1\sigma_3r_{13}$.

Hint: Use $\sigma_{1e}^2 = \dfrac{\sum x_{1e}^2}{N}$ and equations (7).

(b) Show that $S_{1(23)}^2 = \sigma_1^2 - \sigma_{1e}^2$.

Hint: Use equation (12) and (a).

(c) Show that $R_{1(23)} = \dfrac{\sigma_{1e}}{\sigma_1}$.

(d) Show that $\sum x_1x_{1e} = N\sigma_{1e}^2$.

Hint: Multiply the value of $x_{1e}$ in (a) by $x_1$, and sum. Change the
$\sum$ quantities on the right-hand side into statistical symbols.

(e) Show that $r_{1,1e} = \dfrac{\sigma_{1e}}{\sigma_1}$.

(f) Show that $R_{1(23)} = r_{1,1e}$.

2. Show that $R_{1(23)}^2 = \dfrac{b_{12}\sum x_1x_2 + b_{13}\sum x_1x_3}{\sum x_1^2}$.

3. Show that $R_{1(23)}^2 = \dfrac{1}{\sigma_1}\left[r_{12}b_{12}\sigma_2 + r_{13}b_{13}\sigma_3\right]$.

4. Show that, for the least-squares plane, the algebraical sum of the
residuals is zero.

5. State three important properties of the least-squares plane fitting a
set of N points.

6. (Davies and Crowder.) The following table gives the rankings of the 
specified states in 1860. 

You can save labor by using the values of $\sum X$ and $\sum X^2$ given on pages 9
and 10, or by using (20) page 265.

$X_1$ = rank of the specified state in notables
$X_2$ = rank of the specified state in education
$X_3$ = rank of the specified state in capital


State            X_1  X_2  X_3     State             X_1  X_2  X_3
Alabama           24   23   24     Mississippi        28   17   27
Arkansas          29   27   29     Missouri           19   18   19
Connecticut        2    2    3     New Hampshire       5    5    7
Delaware           8   19    8     New Jersey          9   13    4
Florida           27   29   28     New York            7    7    6
Georgia           25   25   23     North Carolina     22   28   22
Illinois          14   14   16     Ohio               10   11   10
Indiana           17   16   14     Pennsylvania       12   10    5
Iowa              16   12   26     Rhode Island        4    8    1
Kentucky          20   22   15     South Carolina     21   21   21
Louisiana         26   20   25     Tennessee          23   24   18
Maine              6    4   12     Vermont             3    3   11
Maryland          13   15    9     Virginia           18   26   13
Massachusetts      1    1    2     Wisconsin          15    6   20
Michigan          11    9   17


(1) Verify the values:

$M_1 = 15$          $M_2 = 15$          $M_3 = 15$

$\sigma_1 = 8.367$    $\sigma_2 = 8.367$    $\sigma_3 = 8.367$

$r_{12} = 0.867$    $r_{13} = 0.886$    $r_{23} = 0.670$

(2) Find $R_{1(23)}$ and $S_{1(23)}$.

74. DETERMINANTS 


A. Determinants of the Second Order. 

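If we solve the equations

$$\begin{aligned}
a_1x_1 + b_1y_1 &= c_1 \\
a_2x_1 + b_2y_1 &= c_2
\end{aligned}$$

simultaneously, we obtain the solutions:

$$x_1 = \frac{c_1b_2 - c_2b_1}{a_1b_2 - a_2b_1}, \qquad y_1 = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1}$$

By adopting the shorthand notation

$$\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1b_2 - a_2b_1, \qquad
\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix} = c_1b_2 - c_2b_1, \quad \text{etc.}$$

we may write the solutions

$$x_1 = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad
y_1 = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}$$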

The square arrays defined above are called determinants. Since
there are two rows and two columns, we call the arrays determinants
of the second order. The letters $a_1$, $a_2$, $b_1$, $b_2$, etc. are called the elements
of the determinant. The elements $a_1$, $b_2$ constitute the principal
diagonal of the determinant found in the denominators of $x_1$ and $y_1$.

We note that the denominators of $x_1$ and $y_1$ are the same determinant,
that formed from the coefficients as they stand in the given
equations. Further, we note that the numerator for $x_1$ may be
obtained from the denominator by replacing $a_1$, $a_2$, which are
coefficients of $x_1$ in the given equations, by the terms $c_1$, $c_2$. Similarly,
the numerator for $y_1$ is the determinant of the denominator with
$b_1$, $b_2$ replaced by $c_1$, $c_2$ respectively. The determinant of the
denominator is called the determinant of the system.

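Example. Solve by determinants:

$$\begin{aligned}
x + y &= 3 \\
2x + 3y &= 1
\end{aligned}$$

Solution:

$$x = \frac{\begin{vmatrix} 3 & 1 \\ 1 & 3 \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix}} = \frac{9 - 1}{3 - 2} = 8, \qquad
y = \frac{\begin{vmatrix} 1 & 3 \\ 2 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix}} = \frac{1 - 6}{3 - 2} = -5$$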
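The rule for two equations in two unknowns can be sketched in Python as follows. This is an illustration, not part of the original text; the function names `det2` and `solve_2x2` are hypothetical, and the arguments follow the notation $a_1x + b_1y = c_1$, $a_2x + b_2y = c_2$ used above.

```python
def det2(p, q, r, s):
    """Second-order determinant with rows (p, q) and (r, s): p*s - r*q."""
    return p * s - r * q


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by determinants."""
    system = det2(a1, b1, a2, b2)          # determinant of the system
    if system == 0:
        raise ValueError("determinant of the system is zero; no unique solution")
    x = det2(c1, b1, c2, b2) / system      # a-column replaced by the c terms
    y = det2(a1, c1, a2, c2) / system      # b-column replaced by the c terms
    return x, y


# The worked example: x + y = 3, 2x + 3y = 1.
print(solve_2x2(1, 1, 3, 2, 3, 1))  # → (8.0, -5.0)
```

The check for a zero determinant is an addition: the method gives a unique solution only when the determinant of the system does not vanish.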





EXERCISES 

Solve the following pairs of equations using determinants: 


1. x + y = 2
   2x + 3y = 7

2. x - 3y = 6
   4x - 5y = 24

3. 0.3x + 0.2y = 4.0
   0.7x - 0.6y = 26.4

4. 4x - 8y = 17
   12x + 16y = -9


B. Determinants of the Third Order. The solution of three
equations in three unknowns is also facilitated by the use of
determinants. In this case we have square arrays of three rows and
three columns or determinants of the third order. The square array
in the left-hand member of the equality

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}
- b_1\begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix}
+ c_1\begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix}$$

is a determinant of the third order. It is defined in terms of
determinants of the second order as in the right-hand member of the
above equality which is called the expansion of the determinant.
The elements $a_1$, $b_2$, $c_3$ constitute the principal diagonal.

The second order determinants in the above equality are called
minors of the elements $a_1$, $b_1$, $c_1$ respectively. The minor to $a_1$ is
the determinant that remains after crossing out the row and the
column in which $a_1$ lies. Similarly the minor for any other element
is found.

The above determinant was expanded according to the elements of
the first row. We may also expand it according to the elements of
the first column. Thus,

$$D = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}
- a_2\begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix}
+ a_3\begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}$$

It is obvious that the complete development of a determinant of
the third order has six terms. Thus,

$$D = a_1b_2c_3 + a_2b_3c_1 + a_3b_1c_2 - a_1b_3c_2 - a_3b_2c_1 - a_2b_1c_3$$


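If we solve by elementary algebra the equations

$$\begin{aligned}
a_1x + b_1y + c_1z &= d_1 \\
a_2x + b_2y + c_2z &= d_2 \\
a_3x + b_3y + c_3z &= d_3
\end{aligned}$$

for x, we obtain

$$x = \frac{d_1b_2c_3 + d_2b_3c_1 + d_3b_1c_2 - d_1b_3c_2 - d_3b_2c_1 - d_2b_1c_3}{a_1b_2c_3 + a_2b_3c_1 + a_3b_1c_2 - a_1b_3c_2 - a_3b_2c_1 - a_2b_1c_3}$$

The denominator is the development of the determinant D, above,
and the numerator is the same as the denominator with $a_i$ replaced
by $d_i$, $i = 1, 2, 3$. Hence we can write

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}$$

In a similar way we can find y and z:

$$y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad
z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}$$

We note that the denominators of x, y, and z are the same, the
determinant of the system. The determinant in the numerator of
any unknown can be obtained from the denominator by replacing
the column of the coefficients of this unknown by the corresponding
known terms, $d_1$, $d_2$, $d_3$.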
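The column-replacement rule for three unknowns can be sketched in Python as below. This is an illustration under stated assumptions: the names `det3` and `cramer3` are hypothetical, the sample system is made up for the demonstration, and the inputs are assumed to be integers (or `Fraction` values) so that the quotients come out exactly.

```python
from fractions import Fraction


def det3(m):
    """Third-order determinant, expanded according to the elements of the first row."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - b1 * (a2 * c3 - a3 * c2)
            + c1 * (a2 * b3 - a3 * b2))


def cramer3(coefficients, constants):
    """Solve three linear equations in three unknowns by determinants."""
    system = det3(coefficients)
    if system == 0:
        raise ValueError("determinant of the system is zero; no unique solution")
    solution = []
    for column in range(3):
        # Replace the column of this unknown's coefficients by the known terms.
        replaced = [list(row) for row in coefficients]
        for i in range(3):
            replaced[i][column] = constants[i]
        solution.append(Fraction(det3(replaced), system))
    return tuple(solution)


# Hypothetical system: x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27.
x, y, z = cramer3([[1, 1, 1], [0, 2, 5], [2, 5, -1]], [6, -4, 27])
print(x, y, z)  # → 5 3 -2
```

Exact fractions are used so that the answers can be compared directly with a hand computation.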

In the expansion of D we note that the sign preceding the minor
of $a_1$ is plus, that preceding the minor of $a_2$ is minus, that preceding
the minor of $a_3$ is plus. The sign preceding a minor corresponding
to an element is easy to remember. Consider an element in the
h-row and k-column. If (h + k) is even the sign prefixed to the
minor is plus, and if (h + k) is odd the sign prefixed to the minor
is minus. The minor of an element with its sign attached is called
the co-factor of the element. We note that D is equal to the sum
of the products of the elements of any row (or column) and their
respective co-factors.

EXERCISES 
1. Evaluate the determinant

$$\begin{vmatrix} 1 & 3 & 4 \\ 2 & 7 & 3 \\ 1 & 3 & 5 \end{vmatrix}$$

by expanding (a) according to the elements in the first row, and (b)
according to the elements in the first column.

Solve for x, y, and z the equations:

2. x - y - z = -6
   2x + y + z = 0
   3x - 5y + 8z = 13

3. x + 2y - z = 6
   2x - y + 3z = -13
   3x - 2y + 3z = 16

C. Determinants of Any Order. We defined a determinant of
the third order in terms of the elements of a row (or column) and
their minors. Similarly we may define determinants of the fourth
and higher orders. Thus, the following determinant of the fourth
order

$$\begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix}
= a_1\begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
- a_2\begin{vmatrix} b_1 & c_1 & d_1 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
+ a_3\begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_4 & c_4 & d_4 \end{vmatrix}
- a_4\begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \end{vmatrix}$$

is defined in terms of the elements of the first column and their
minors. The sign preceding a minor of an element in h-row and
k-column is plus or minus according as (h + k) is even or odd. A
minor of an element with its sign attached is the co-factor of the
element. The value of a determinant is the sum of the products of
the elements of a row (or column) and their co-factors.

Just as we define determinants of the third and fourth orders in
terms of the elements of a row or column and their co-factors, so we
define a determinant of any order to be the sum of the products of the
elements of a row (or column) and their respective co-factors.
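The definition by co-factors is recursive, and a direct Python sketch of it follows. The code is illustrative only; the names `minor` and `det` are hypothetical, and the matrices are assumed to be given as lists of equal-length lists. Expansion is by the elements of the first column, as in the fourth-order example above.

```python
def minor(m, h, k):
    """Array that remains after striking out row h and column k (0-based indices)."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(m) if i != h]


def det(m):
    """Determinant of any order, expanded by the elements of the first column."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for h in range(n):
        # With 0-based h and the first column, the sign is plus when h is even.
        sign = 1 if h % 2 == 0 else -1
        total += sign * m[h][0] * det(minor(m, h, 0))
    return total


# Hypothetical examples chosen for easy hand checking.
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
print(det([[2, 0, 0, 0], [0, 3, 0, 0], [0, 0, 4, 0], [0, 0, 0, 5]]))
```

The amount of work in this expansion grows very rapidly with the order, so the sketch is suited to small determinants such as those met in this chapter.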



EXERCISES

1. Expand the following determinants (a) according to the elements
of the first row, and (b) according to the elements of the first column.

$$(1)\ \begin{vmatrix} 2 & 4 & -2 & 3 \\ 1 & -2 & 1 & 0 \\ -2 & 0 & -1 & 3 \\ 2 & 3 & -2 & 3 \end{vmatrix}
\qquad
(2)\ \begin{vmatrix} -2 & 1 & 3 & 0 \\ 5 & -3 & 3 & 1 \\ 4 & 0 & 2 & 4 \\ 1 & 2 & 3 & 3 \end{vmatrix}$$


2. The following theorems are true for determinants of any order.
We ask the student to prove them for determinants of the third order.

(1) If the corresponding rows and columns of D be interchanged, D is
unchanged in value.

(2) If any two rows (or columns) of D be interchanged, D becomes -D.

(3) If any two rows (or columns) be identical, D = 0.

(4) If each element of a row (or column) of D be multiplied by k, the
value of the resulting determinant is kD.

(5) If to each element of a row (or column) of D is added k times the
corresponding element of another row (or column), D is unchanged
in value.


75. APPLICATION OF DETERMINANTS 
Three Variables 


The results of the analysis of the foregoing sections on multiple 
correlation can be expressed in very simple forms by the use of 
determinants. 

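Let

\[
D = \begin{vmatrix}
r_{11} & r_{12} & r_{13} \\
r_{21} & r_{22} & r_{23} \\
r_{31} & r_{32} & r_{33}
\end{vmatrix}
= \begin{vmatrix}
1 & r_{12} & r_{13} \\
r_{21} & 1 & r_{23} \\
r_{31} & r_{32} & 1
\end{vmatrix}
\]

where $r_{hk}$ is the element in the $h$-row and the $k$-column. Evidently
$r_{hh} = r_{kk} = 1$, and $r_{hk} = r_{kh}$.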

A minor $D_{hk}$ of the element $r_{hk}$ is the determinant formed by the
elements that remain after striking out all the coefficients in the row
and the column common to $r_{hk}$. Thus, for example,

\[
D_{11} = \begin{vmatrix} r_{22} & r_{23} \\ r_{32} & r_{33} \end{vmatrix} = 1 - r_{23}^2
\]

\[
D_{12} = \begin{vmatrix} r_{21} & r_{23} \\ r_{31} & r_{33} \end{vmatrix} = r_{12} - r_{13}r_{23}
\]

\[
D_{13} = \begin{vmatrix} r_{21} & r_{22} \\ r_{31} & r_{32} \end{vmatrix} = r_{12}r_{23} - r_{13}
\]

A co-factor $A_{hk}$ of the element $r_{hk}$ is the minor $D_{hk}$ with the sign that
would be prefixed to it when the determinant D is expanded. The
sign that is prefixed to the minor is positive or negative according
as $(h + k)$ is even or odd. That is,

\[
A_{hk} = (-1)^{h+k} D_{hk}
\]

Expanding D according to the elements of the first row, we have

\[
\begin{aligned}
D &= r_{11}D_{11} - r_{12}D_{12} + r_{13}D_{13} \\
  &= r_{11}A_{11} + r_{12}A_{12} + r_{13}A_{13} \\
  &= 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2r_{12}r_{13}r_{23}
\end{aligned}
\tag{16}
\]

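Now let us solve equations (7) by determinants and note the
simplicity of the results.

\[
\begin{aligned}
b_{12}\sigma_2 + b_{13}\sigma_3 r_{23} &= \sigma_1 r_{12} \\
b_{12}\sigma_2 r_{23} + b_{13}\sigma_3 &= \sigma_1 r_{13}
\end{aligned}
\tag{7}
\]

We obtain

\[
b_{12} =
\frac{\begin{vmatrix} \sigma_1 r_{12} & \sigma_3 r_{23} \\ \sigma_1 r_{13} & \sigma_3 \end{vmatrix}}
     {\begin{vmatrix} \sigma_2 & \sigma_3 r_{23} \\ \sigma_2 r_{23} & \sigma_3 \end{vmatrix}}
= \frac{\sigma_1}{\sigma_2}\,
\frac{\begin{vmatrix} r_{12} & r_{23} \\ r_{13} & 1 \end{vmatrix}}
     {\begin{vmatrix} 1 & r_{23} \\ r_{23} & 1 \end{vmatrix}}
\]

\[
b_{13} =
\frac{\begin{vmatrix} \sigma_2 & \sigma_1 r_{12} \\ \sigma_2 r_{23} & \sigma_1 r_{13} \end{vmatrix}}
     {\begin{vmatrix} \sigma_2 & \sigma_3 r_{23} \\ \sigma_2 r_{23} & \sigma_3 \end{vmatrix}}
= \frac{\sigma_1}{\sigma_3}\,
\frac{\begin{vmatrix} 1 & r_{12} \\ r_{23} & r_{13} \end{vmatrix}}
     {\begin{vmatrix} 1 & r_{23} \\ r_{23} & 1 \end{vmatrix}}
\]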

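The regression coefficients in the determinant notation are

\[
b_{12} = \frac{D_{12}}{D_{11}}\frac{\sigma_1}{\sigma_2} = -\frac{A_{12}}{A_{11}}\frac{\sigma_1}{\sigma_2}
\qquad
b_{13} = -\frac{D_{13}}{D_{11}}\frac{\sigma_1}{\sigma_3} = -\frac{A_{13}}{A_{11}}\frac{\sigma_1}{\sigma_3}
\tag{17}
\]

and the equations of the regression planes (9) and (10) are

\[
\frac{x_1}{\sigma_1}A_{11} + \frac{x_2}{\sigma_2}A_{12} + \frac{x_3}{\sigma_3}A_{13} = 0
\]

and

\[
\frac{(X_1 - M_1)}{\sigma_1}A_{11} + \frac{(X_2 - M_2)}{\sigma_2}A_{12} + \frac{(X_3 - M_3)}{\sigma_3}A_{13} = 0
\tag{18}
\]

or

\[
\sum_{i=1}^{3} \frac{x_i}{\sigma_i}A_{1i} = 0
\]

Applying the determinant notation to equation (13) we get

\[
S_{1(23)} = \sigma_1\sqrt{\frac{D}{D_{11}}}
\tag{19}
\]

which, when substituted in (14), leads to

\[
R_{1(23)} = \sqrt{1 - \frac{D}{D_{11}}}
\tag{20}
\]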

EXERCISES 



1. Analyze the data of Exercise 3, page 284, using determinants. 

2. Analyze the data of Exercise 4, page 285, using determinants. 

3. Analyze the data of Exercise 6, page 288, using determinants. 


76. PARTIAL CORRELATION 

Sometimes a correlation between two factors is due to the influence 
of one or more other factors rather than to any inherent relationship 
between the two themselves.¹ For this reason it is necessary to
eliminate as far as possible those uncontrolled factors which, through 
their common relation to the variables to be correlated, tend to 
influence unduly the true correlation. This is accomplished by a 
technique known as partial correlation. 

It is desirable, therefore, to obtain the correlation between X\ 
and X 2 , say, when X 3 has a fixed value. For example, we can find 
the correlation between English and Mathematics (p. 285) when 
Intelligence is constant, say 100, but not completely ignored. 

In bivariate correlation, it will be recalled that the values of the
regression coefficients $b_{12}$ and $b_{21}$ of the regression equations

\[
X_1 = b_{12}X_2 + C_1 \quad\text{and}\quad X_2 = b_{21}X_1 + C_2
\]

were found to be²

\[
b_{12} = r_{12}\frac{\sigma_1}{\sigma_2} \quad\text{and}\quad b_{21} = r_{12}\frac{\sigma_2}{\sigma_1}
\]

¹ See page 268.

² See pages 248 and 249.

The quantity $b_{12}$ measures the regression of $X_1$ on $X_2$ and $b_{21}$
measures the regression of $X_2$ on $X_1$ when all other factors are ignored.
We also found that

\[
r_{12}^2 = b_{12} \cdot b_{21}
\tag{21}
\]

Similarly, $b_{12.3}$ and $b_{21.3}$ measure the regression of $X_1$ on $X_2$ and
of $X_2$ on $X_1$ respectively in the equations¹

\[
X_1 = b_{12.3}X_2 + b_{13.2}X_3 + C_1 \quad\text{and}\quad X_2 = b_{21.3}X_1 + b_{23.1}X_3 + C_2
\]

when $X_3$ is held constant but not ignored. Since the conditions leading
to equation (21) in bivariate correlation are exactly paralleled here,
we define the partial correlation coefficient $r_{12.3}$ between $X_1$ and $X_2$
for an assigned value of $X_3$ by the equation

\[
r_{12.3}^2 = b_{12.3} \cdot b_{21.3} \quad\text{or}\quad r_{12.3} = \sqrt{b_{12.3}\,b_{21.3}}
\]


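In terms of the constants previously determined in (17) we find

\[
r_{12.3}
= \pm\sqrt{\frac{D_{12}}{D_{11}}\frac{\sigma_1}{\sigma_2} \cdot \frac{D_{21}}{D_{22}}\frac{\sigma_2}{\sigma_1}}
= \pm\frac{D_{12}}{\sqrt{D_{11}D_{22}}}
= \pm\frac{A_{12}}{\sqrt{A_{11}A_{22}}}
\tag{22}
\]

since the major determinant is symmetrical about the principal
diagonal and hence $D_{12} = D_{21}$. The sign attached to $r_{12.3}$ is that of
$b_{12.3}$ or $b_{12}$.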

It is noted that $r_{12.3}$ is generally unequal to $r_{12}$. The quantity $r_{12}$
measures the degree of correlation between $X_1$ and $X_2$ when all
other factors are completely ignored, whereas $r_{12.3}$ measures the
degree of correlation between $X_1$ and $X_2$ when $X_3$ is held fixed but not
ignored. The principal application of partial correlation is thus
approximating what the correlation between two variables would
be if the influence of other variables were eliminated.

Professor Sorenson² gives an interesting illustration that shows
the influence of the third variable on the correlation between the
other two variables. In his illustration

$X_1$ represents the carpal area of children
$X_2$ represents the mental age of children
$X_3$ represents the chronological age of children

¹ The subscripts following the point merely indicate the variables that are
held fixed in the development. They may frequently be omitted from the detail.

² Herbert Sorenson: Statistics for Students of Psychology and Education,
p. 252.

The following simple correlations were obtained.

\[
r_{12} = 0.83 \qquad r_{13} = 0.92 \qquad r_{23} = 0.88
\]

Naturally we are impressed by the apparently large correlation
between the skeletal development (carpal area) of children and
their mental age. When we "partial out" or remove the influence
of the third factor, chronological age, we find

\[
r_{12.3} = \frac{D_{12}}{\sqrt{D_{11}D_{22}}} = \frac{0.0204}{\sqrt{(0.2256)(0.1536)}} = 0.11
\]

which indicates very slight, if any, correlation.
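The arithmetic of this illustration can be retraced in a few lines. The sketch below is in Python; the function name `partial_r12_3` is our own, and the minors are written out from their three-variable definitions, $D_{12} = r_{12} - r_{13}r_{23}$, $D_{11} = 1 - r_{23}^2$, $D_{22} = 1 - r_{13}^2$.

```python
import math


def partial_r12_3(r12, r13, r23):
    """Partial correlation r12.3 computed as D12 / sqrt(D11 * D22)."""
    d12 = r12 - r13 * r23   # minor D12
    d11 = 1 - r23 ** 2      # minor D11
    d22 = 1 - r13 ** 2      # minor D22
    return d12 / math.sqrt(d11 * d22)


# Simple correlations quoted in the illustration above.
r = partial_r12_3(0.83, 0.92, 0.88)
print(round(r, 2))
```

The signed quotient is returned directly; since $D_{11}$ is positive, its sign is that of $D_{12}$ and therefore that of $b_{12}$, in agreement with the rule stated after (22).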

EXERCISES 

1. Express $r_{12.3}$ in terms of simple correlation coefficients.

2. Write down the values of $r_{13.2}$ and $r_{23.1}$.

3. Show that:

\[
S_{1(23)}^2 = \sigma_1^2(1 - r_{12}^2)(1 - r_{13.2}^2) = \sigma_1^2(1 - r_{13}^2)(1 - r_{12.3}^2)
\]

4. By permuting the subscripts in number 3 preceding, write down the
values for $S_{2(13)}$ and $S_{3(12)}$.

5. In a certain study of a group of students' grades

$X_1$ denotes the percentage grades in mathematics
$X_2$ denotes the percentage grades in chemistry
$X_3$ denotes the percentage grades in history

— 72 o i =: 8 7*12 — .b 

AIo — 68 o ' 2 ~ 10 7*13 = .4 

M 3 ~ 78 (7 3 — 7 r 23 — .3 

What is the probable grade in chemistry of a student whose grades are: 
mathematics, 80%; history, 70%? 


77. THE CASE OF FOUR VARIABLES 

In the preceding sections we have considered in great detail the 
case of multiple and partial correlation based upon three variables. 
We shall greatly abbreviate the theory for the case of four variables 
leaving the details to be supplied by the reader. 

Assume that we have N sets of data in the four variables $X_1$, $X_2$,
$X_3$, $X_4$ and that we wish to determine the regression coefficients
$b_{12}$, $b_{13}$, $b_{14}$, and the constant $c$ so that $X_1$ computed from the
hyperplane

\[
X_1 = b_{12}X_2 + b_{13}X_3 + b_{14}X_4 + c
\tag{23}
\]

may be the best estimate of $X_1$ for assigned values of $X_2$, $X_3$, $X_4$.
Adopting the least-squares criterion, we may determine the regression
coefficients so that

\[
\sum\rho^2 = \sum\left[X_1 - (b_{12}X_2 + b_{13}X_3 + b_{14}X_4 + c)\right]^2
\tag{24}
\]

shall be a minimum.

Equating to zero the first partial derivatives of $\sum\rho^2$ with respect to
$c$, $b_{12}$, $b_{13}$, $b_{14}$, we obtain the normal equations

\[
\begin{aligned}
b_{12}\sum X_2 + b_{13}\sum X_3 + b_{14}\sum X_4 + Nc &= \sum X_1 \\
b_{12}\sum X_2^2 + b_{13}\sum X_2X_3 + b_{14}\sum X_2X_4 + c\sum X_2 &= \sum X_1X_2 \\
b_{12}\sum X_2X_3 + b_{13}\sum X_3^2 + b_{14}\sum X_3X_4 + c\sum X_3 &= \sum X_1X_3 \\
b_{12}\sum X_2X_4 + b_{13}\sum X_3X_4 + b_{14}\sum X_4^2 + c\sum X_4 &= \sum X_1X_4
\end{aligned}
\tag{25}
\]

By dividing the first of equations (25) by N, we may show that
the hyperplane (23), for the values of $b_{12}$, $b_{13}$, $b_{14}$, $c$ given by (25),
passes through the point $(M_1, M_2, M_3, M_4)$.

Referring our data to this point as origin our regression equation
becomes

\[
x_1 = b_{12}x_2 + b_{13}x_3 + b_{14}x_4
\tag{26}
\]

where $x_i = X_i - M_i$, $i = 1, 2, 3, 4$.

That is, our regression equation is of the form (26) when the variables
are deviations from their respective means.

By minimizing the sum of the squares of the $x_1$-residuals,

\[
\sum\rho^2 = \sum\left[x_1 - (b_{12}x_2 + b_{13}x_3 + b_{14}x_4)\right]^2
\]

we obtain the normal equations

\[
\begin{aligned}
b_{12}\sum x_2^2 + b_{13}\sum x_2x_3 + b_{14}\sum x_2x_4 &= \sum x_1x_2 \\
b_{12}\sum x_2x_3 + b_{13}\sum x_3^2 + b_{14}\sum x_3x_4 &= \sum x_1x_3 \\
b_{12}\sum x_2x_4 + b_{13}\sum x_3x_4 + b_{14}\sum x_4^2 &= \sum x_1x_4
\end{aligned}
\tag{27}
\]

Expressing the summations in terms of standard deviations and
coefficients of correlation, equations (27) become




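\[
\begin{aligned}
b_{12}\sigma_2 + b_{13}\sigma_3 r_{23} + b_{14}\sigma_4 r_{24} &= \sigma_1 r_{12} \\
b_{12}\sigma_2 r_{23} + b_{13}\sigma_3 + b_{14}\sigma_4 r_{34} &= \sigma_1 r_{13} \\
b_{12}\sigma_2 r_{24} + b_{13}\sigma_3 r_{34} + b_{14}\sigma_4 &= \sigma_1 r_{14}
\end{aligned}
\tag{28}
\]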

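Let D denote the major determinant:

\[
D = \begin{vmatrix}
r_{11} & r_{12} & r_{13} & r_{14} \\
r_{21} & r_{22} & r_{23} & r_{24} \\
r_{31} & r_{32} & r_{33} & r_{34} \\
r_{41} & r_{42} & r_{43} & r_{44}
\end{vmatrix}
= \begin{vmatrix}
1 & r_{12} & r_{13} & r_{14} \\
r_{21} & 1 & r_{23} & r_{24} \\
r_{31} & r_{32} & 1 & r_{34} \\
r_{41} & r_{42} & r_{43} & 1
\end{vmatrix}
\]

Further, let $D_{hk}$ be the minor and $A_{hk}$ the co-factor of $r_{hk}$ so that
$A_{hk} = (-1)^{h+k}D_{hk}$. Then the solutions of (28) become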


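\[
\begin{aligned}
b_{12} &= \frac{\sigma_1}{\sigma_2}\frac{D_{12}}{D_{11}} = -\frac{\sigma_1}{\sigma_2}\frac{A_{12}}{A_{11}} \\
b_{13} &= -\frac{\sigma_1}{\sigma_3}\frac{D_{13}}{D_{11}} = -\frac{\sigma_1}{\sigma_3}\frac{A_{13}}{A_{11}} \\
b_{14} &= \frac{\sigma_1}{\sigma_4}\frac{D_{14}}{D_{11}} = -\frac{\sigma_1}{\sigma_4}\frac{A_{14}}{A_{11}}
\end{aligned}
\tag{29}
\]

and the equations of the regression hyperplane become

\[
\frac{x_1}{\sigma_1}A_{11} + \frac{x_2}{\sigma_2}A_{12} + \frac{x_3}{\sigma_3}A_{13} + \frac{x_4}{\sigma_4}A_{14} = 0
\tag{30}
\]

\[
\frac{(X_1 - M_1)}{\sigma_1}A_{11} + \frac{(X_2 - M_2)}{\sigma_2}A_{12} + \frac{(X_3 - M_3)}{\sigma_3}A_{13} + \frac{(X_4 - M_4)}{\sigma_4}A_{14} = 0
\tag{31}
\]

expressed in terms of the deviations from their respective means
and the original variates respectively.

If the respective deviations from the means be expressed in units
of their standard deviations, that is, if

\[
t_i = \frac{x_i}{\sigma_i} = \frac{X_i - M_i}{\sigma_i}, \qquad i = 1, 2, 3, 4
\]

equations (30) and (31) become

\[
A_{11}t_1 + A_{12}t_2 + A_{13}t_3 + A_{14}t_4 = 0
\quad\text{or}\quad
\sum_{i=1}^{4} A_{1i}t_i = 0
\tag{32}
\]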



Adopting as a measure of the accuracy of fit of (30), (31), or (32)
to the given observed values the quantity

\[
S_{1(234)} = \sqrt{\frac{\sum\rho^2}{N}}
\]

after some rather tedious algebraic operations we find

\[
S_{1(234)} = \sigma_1\sqrt{\frac{r_{11}D_{11} - r_{12}D_{12} + r_{13}D_{13} - r_{14}D_{14}}{D_{11}}}
\tag{33}
\]

\[
S_{1(234)} = \sigma_1\sqrt{\frac{D}{D_{11}}}
\tag{34}
\]

Equation (33) may also be written

\[
S_{1(234)} = \sigma_1\sqrt{1 - R_{1(234)}^2}
\tag{35}
\]

where

\[
R_{1(234)} = \sqrt{\frac{r_{12}D_{12} - r_{13}D_{13} + r_{14}D_{14}}{D_{11}}} = \sqrt{1 - \frac{D}{D_{11}}}
\tag{36}
\]


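Defining $r_{12.34}$, the partial coefficient of correlation between $X_1$
and $X_2$ when the variables $X_3$ and $X_4$ are held fixed, by the equation

\[
r_{12.34} = \pm\sqrt{(b_{12.34})(b_{21.34})}
\]

we immediately obtain

\[
r_{12.34} = \pm\frac{D_{12}}{\sqrt{D_{11}D_{22}}} = \pm\frac{A_{12}}{\sqrt{A_{11}A_{22}}}
\]

Similarly

\[
r_{13.24} = \pm\frac{D_{13}}{\sqrt{D_{11}D_{33}}} = \pm\frac{A_{13}}{\sqrt{A_{11}A_{33}}}
\]

\[
r_{14.23} = \pm\frac{D_{14}}{\sqrt{D_{11}D_{44}}} = \pm\frac{A_{14}}{\sqrt{A_{11}A_{44}}}
\]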


The signs of these values, $r_{12.34}$, $r_{13.24}$, etc. are the same as $b_{12}$, $b_{13}$, etc.
The following steps are recommended in the computation of the
constants, assuming that the arithmetic means, the standard
deviations, and the simple correlation coefficients have been computed.

(1) Write down D.

(2) Compute $D_{11}$, $D_{22}$, $D_{33}$, $D_{44}$.

(3) Compute $D_{12}$, $D_{13}$, $D_{14}$, $D_{23}$, $D_{24}$, $D_{34}$.

(4) Compute $A_{12}$, $A_{13}$, etc. from $A_{hk} = (-1)^{h+k}D_{hk}$.

(5) Compute the value of D from the formula

\[
D = r_{11}A_{11} + r_{12}A_{12} + r_{13}A_{13} + r_{14}A_{14}
\]

(6) Compute $b_{12}$, $b_{13}$, etc.

(7) Compute $r_{12.34}$, $r_{13.24}$, $r_{14.23}$.

(8) Compute $R_{1(234)}$ from equation (36).

(9) Compute $S_{1(234)}$ from equation (34).

(10) Write down the regression equation.
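The steps above can be sketched as a short program. What follows is only an illustration in Python with NumPy, not a prescribed procedure: the names `minor`, `cofactor`, and `multiple_regression` are our own, the determinants are evaluated by `numpy.linalg.det` rather than by hand expansion, and the partial coefficients are returned with the sign of the corresponding $b_{1k}$, that is, as $-A_{1k}/\sqrt{A_{11}A_{kk}}$. The same function serves for any number of variables; it is run here on the three correlations of Sorenson's illustration because those figures appear in the text.

```python
import numpy as np


def minor(R, h, k):
    """Minor D_hk of the correlation matrix R (h, k are 1-based)."""
    sub = np.delete(np.delete(R, h - 1, axis=0), k - 1, axis=1)
    return np.linalg.det(sub)


def cofactor(R, h, k):
    """Co-factor A_hk = (-1)**(h + k) * D_hk."""
    return (-1) ** (h + k) * minor(R, h, k)


def multiple_regression(R, sigma):
    """Return D, the b_1k, the partial r_1k, R_1(23...n) and S_1(23...n)."""
    n = R.shape[0]
    A11 = cofactor(R, 1, 1)
    # Step (5): expand D by the elements of the first row.
    D = sum(R[0, k - 1] * cofactor(R, 1, k) for k in range(1, n + 1))
    # Step (6): b_1k = -(sigma_1 / sigma_k) * A_1k / A_11.
    b = {k: -(sigma[0] / sigma[k - 1]) * cofactor(R, 1, k) / A11
         for k in range(2, n + 1)}
    # Step (7): partial coefficients, signed like the b_1k.
    partial = {k: -cofactor(R, 1, k) / np.sqrt(A11 * cofactor(R, k, k))
               for k in range(2, n + 1)}
    R_mult = np.sqrt(1 - D / A11)          # step (8)
    S = sigma[0] * np.sqrt(D / A11)        # step (9)
    return D, b, partial, R_mult, S


# Correlations of Sorenson's illustration; unit standard deviations are
# assumed here purely for the sake of the example.
R3 = np.array([[1.00, 0.83, 0.92],
               [0.83, 1.00, 0.88],
               [0.92, 0.88, 1.00]])
D, b, partial, R_mult, S = multiple_regression(R3, [1.0, 1.0, 1.0])
print(round(partial[2], 2))
```

For a four-variable problem one would pass the 4 by 4 matrix of simple correlations and the four standard deviations; the regression equation of step (10) is then written down from the returned coefficients and the means.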

EXERCISES 


1. The following table gives the fundamental constants obtained from
the measurement of 450 eggs.¹

                               Length   Breadth    Bulk     Weight
                               (mm.)    (mm.)      (cc.)    (gm.)

Correlation      Length        1.0000    0.0837    0.5751    0.5797
coefficients     Breadth       0.0837    1.0000    0.8602    0.8357
                 Bulk          0.5751    0.8602    1.0000    0.9804
                 Weight        0.5797    0.8357    0.9804    1.0000

Arithmetic means              56.3222   41.9167   51.8400   55.2400
Standard deviations            2.3862    1.3777    4.2438    4.5923


(a) Find the regression of weight upon length and breadth. 

(b) What is the estimated weight of an egg of the following measure- 
ments: length 56.03 mm., breadth 42.02 mm.? 

(c) Find the regression equation of weight on length and bulk. 

(d) Find the regression equation of weight on bulk and breadth. 

(e) Find the standard errors of estimate for (a), (c), and (d). 

(f) What is the best combination for estimating weight? 

2. The data of the following table were secured from measurements of
450 freshmen at Syracuse University²:

$X_1$ = Academic success as measured by the number of honor points
earned by the student during the first semester in college.

$X_2$ = General intelligence based upon standardized tests.

$X_3$ = Industry and application as measured by the number of hours
per week spent in study.

$X_4$ = Quality of preparatory work based upon average high school grade.

(a) Find the regression equation of $X_1$ on $X_2$, $X_3$, and $X_4$.

(b) Estimate $X_1$ when $X_2 = 108$, $X_3 = 32$, and $X_4 = 82$.

(c) Find $R_{1(234)}$ and $S_{1(234)}$.

(d) Find $r_{12.34}$, $r_{13.24}$, and $r_{14.23}$.

¹ Pearl and Surface: A Biometrical Study of Egg Production in the Domestic
Fowl, Part III.

² May, Mark: Predicting Academic Success. Journal of Educational
Psychology, Volume XIV, pp. 429-440.



                        X1       X2       X3       X4

Correlation     X1     1.00     0.60     0.32     0.40
coefficients    X2     0.60     1.00    -0.35     0.36
                X3     0.32    -0.35     1.00     0.11
                X4     0.40     0.36     0.11     1.00

M's                    18.5    100.6     24       79
σ's                    11.2     15.8      6        7.5


3. Show that

\[
r_{12.34} = \frac{r_{12.3} - r_{14.3}\,r_{24.3}}{\sqrt{(1 - r_{14.3}^2)(1 - r_{24.3}^2)}}
\]

4. By permuting the subscripts in number 3 preceding, write down the
values of $r_{13.24}$ and $r_{14.23}$.

5. In the following table the values are monthly averages.

$X_1$ = Wholesale price of butter, 92 score, in cents per lb.

$X_2$ = Apparent consumption, millions of pounds.

$X_3$ = Factory production, millions of pounds.

$X_4$ = Stocks in cold storage at end of month, millions of pounds.


Factors Affecting Wholesale Price of Creamery Butter

Year    X1    X2    X3    X4        Year    X1    X2    X3    X4

1919    61    68    72    67        1929    45   130   133    82
1920    61    73    72    60        1930    37   134   133    83
1921    43    90    88    53        1931    28   142   139    55
1922    41    98    96    51        1932    21   142   141    50
1923    47   106   103    47        1933    22   139   147    92
1924    43   111   113    74        1934    26   147   141    69
1925    45   115   114    62        1935    30   138   136    71
1926    44   123   121    68        1936    33   135   136    60
1927    47   124   125    71        1937    34   138   135    64
1928    47   124   124    62        1938    28   142   149   111


(1) Find the values:

$M_1 =$        $M_2 =$        $M_3 =$        $M_4 =$
$\sigma_1 =$        $\sigma_2 =$        $\sigma_3 =$        $\sigma_4 =$
$r_{12} =$        $r_{13} =$        $r_{14} =$
$r_{23} =$        $r_{24} =$        $r_{34} =$

(2) Find the equation of the regression hyperplane with $X_1$ dependent
upon $X_2$, $X_3$, $X_4$.

(3) Find $R_{1(234)}$ and $S_{1(234)}$.

78. THE CASE OF n VARIABLES 


We shall give a summary of the results of multiple regression for 
the case of n variables leaving all of the details to be carried out by 
the student. 

Let

\[
X_1 = b_{12}X_2 + b_{13}X_3 + \cdots + b_{1n}X_n + c
\tag{37}
\]

be the equation which gives the best value to $X_1$ for any set of values
of $X_2$, $X_3$, ..., $X_n$.

If the regression coefficients $b_{12}$, $b_{13}$, etc., and $c$ are determined so
that the sum of the squares of the $X_1$-residuals is a minimum, the
point $(M_1, M_2, \ldots, M_n)$ is on the hyperplane (37). Transferring
our data to this centroidal point as origin, the regression equation
becomes

\[
x_1 = b_{12}x_2 + b_{13}x_3 + b_{14}x_4 + \cdots + b_{1n}x_n
\tag{38}
\]

where

\[
x_i = X_i - M_i, \qquad i = 1, 2, 3, \ldots, n
\]

Based upon the principle of least squares, the normal equations 
for the determination of the regression coefficients are 


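\[
\begin{aligned}
b_{12}\sigma_2 + b_{13}\sigma_3 r_{23} + \cdots + b_{1n}\sigma_n r_{2n} &= \sigma_1 r_{12} \\
b_{12}\sigma_2 r_{23} + b_{13}\sigma_3 + \cdots + b_{1n}\sigma_n r_{3n} &= \sigma_1 r_{13} \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
b_{12}\sigma_2 r_{2n} + b_{13}\sigma_3 r_{3n} + \cdots + b_{1n}\sigma_n &= \sigma_1 r_{1n}
\end{aligned}
\tag{39}
\]

Defining the major determinant D by the equation

\[
D = \begin{vmatrix}
r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\
r_{21} & r_{22} & r_{23} & \cdots & r_{2n} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
r_{n1} & r_{n2} & r_{n3} & \cdots & r_{nn}
\end{vmatrix}
= \begin{vmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1n} \\
r_{21} & 1 & r_{23} & \cdots & r_{2n} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
r_{n1} & r_{n2} & r_{n3} & \cdots & 1
\end{vmatrix}
\tag{40}
\]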



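we find

\[
\begin{aligned}
b_{12} &= \frac{\sigma_1 D_{12}}{\sigma_2 D_{11}} = -\frac{\sigma_1 A_{12}}{\sigma_2 A_{11}} \\
b_{13} &= -\frac{\sigma_1 D_{13}}{\sigma_3 D_{11}} = -\frac{\sigma_1 A_{13}}{\sigma_3 A_{11}} \\
\cdots\cdots & \\
b_{1n} &= (-1)^n\frac{\sigma_1 D_{1n}}{\sigma_n D_{11}} = -\frac{\sigma_1 A_{1n}}{\sigma_n A_{11}}
\end{aligned}
\tag{41}
\]

where $D_{hk}$ is the minor and $A_{hk}$ is the co-factor of $r_{hk}$,

\[
A_{hk} = (-1)^{h+k}D_{hk}
\]

The regression equation for determining the best value of $x_1$ for given
values of $x_2$, $x_3$, ..., $x_n$ is

\[
\frac{x_1}{\sigma_1}A_{11} + \frac{x_2}{\sigma_2}A_{12} + \cdots + \frac{x_n}{\sigma_n}A_{1n} = 0
\tag{42}
\]

In terms of the original variates the equation of regression is

\[
\frac{(X_1 - M_1)}{\sigma_1}A_{11} + \frac{(X_2 - M_2)}{\sigma_2}A_{12} + \cdots + \frac{(X_n - M_n)}{\sigma_n}A_{1n} = 0
\tag{43}
\]

Equations (42) and (43) may be written

\[
\sum_{i=1}^{n}\frac{x_i}{\sigma_i}A_{1i} = \sum_{i=1}^{n}\frac{(X_i - M_i)}{\sigma_i}A_{1i} = \sum_{i=1}^{n}t_iA_{1i} = 0
\tag{44}
\]

where

\[
t_i = \frac{x_i}{\sigma_i} = \frac{X_i - M_i}{\sigma_i}, \qquad i = 1, 2, 3, \ldots, n
\]

Adopting as a measure of the goodness of fit of (42) to the given
data the quantity

\[
S_{1(23 \cdots n)} = \sqrt{\frac{\sum\rho^2}{N}}
\]

where $\rho = x_1 - (b_{12}x_2 + b_{13}x_3 + \cdots + b_{1n}x_n)$, and the values of $b_{1i}$,
$i = 2, 3, \ldots, n$, are given by (41), we find

\[
S_{1(23 \cdots n)} = \sigma_1\sqrt{\frac{D}{D_{11}}} = \sigma_1\sqrt{1 - R_{1(23 \cdots n)}^2}
\tag{45}
\]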



THE CASE OF n VARIABLES 


305 


where 


#1(23-. n) = y/l 


D 

D n 


Defining the partial coefficient of correlation ru.23— 
tion 


ru.23--n = V6u.23-.ri * 6*1.2 ; 


we find 


ru.23.-n = i 


Z) 


1* 


= = db 


Life 


V DnDkk An A kk 

The sign of r/u.o&.-n is the same as that of 6 **. 


(46) 

» by the equa- 

(47) 
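Since the details are left to the student, a numerical check may be reassuring. The sketch below (Python with NumPy) compares the co-factor solution (41) with a direct solution of the normal equations (39). To keep the check free of units we work with $\beta_k = b_{1k}\sigma_k/\sigma_1$, for which (39) reads $\sum_k \beta_k r_{jk} = r_{1j}$ and (41) reads $\beta_k = -A_{1k}/A_{11}$. The correlation matrix used is invented for the purpose and is not taken from any of the studies cited; the function names are likewise our own.

```python
import numpy as np


def betas_from_cofactors(R):
    """beta_k = -A_1k / A_11 for k = 2, ..., n, as in equation (41)."""
    n = R.shape[0]

    def cofactor(h, k):
        sub = np.delete(np.delete(R, h - 1, axis=0), k - 1, axis=1)
        return (-1) ** (h + k) * np.linalg.det(sub)

    A11 = cofactor(1, 1)
    return np.array([-cofactor(1, k) / A11 for k in range(2, n + 1)])


def betas_from_normal_equations(R):
    """Solve equations (39), divided through by sigma_1, as a linear system."""
    return np.linalg.solve(R[1:, 1:], R[0, 1:])


# A made-up (hypothetical) correlation matrix in four variables.
R = np.array([[1.0, 0.5, 0.3, 0.2],
              [0.5, 1.0, 0.4, 0.1],
              [0.3, 0.4, 1.0, 0.3],
              [0.2, 0.1, 0.3, 1.0]])

print(betas_from_cofactors(R))
print(betas_from_normal_equations(R))
```

The two printed rows should agree to within rounding error. For more than a few variables the linear solve is the more economical route, the co-factor form being chiefly of value for the compactness of the formulas.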



Chapter 10 

NONLINEAR TRENDS: CURVE-FITTING 
79. INTRODUCTION 

The investigator in any branch of science is frequently confronted 
with quantitative data which, when plotted, seem to lie near a smooth 
curve and hence to obey, approximately at least, some mathematical 
law. Thus the following table gives the area Y (in square centi- 
meters) of a wound at the end of X days. 


Figure 41 



The fact that these data when plotted [Figure 41] lie very near a
smooth curve leads us to suspect that they can be represented,
approximately, by the equation of a curve. Such an equation, whose
form is inferred from the results of experiment or observation and
whose constants are determined from experimental or observational
data, is known as an empirical equation. The empirical equation,
once it is derived, is a summarizing expression for the observed data,
and it may be used to obtain a good approximation to the value of
the true ordinate for a given abscissa within the range of values used
in its determination.

The problem of determining the type of equation to be used is 
an indeterminate one, for a number of curves can be drawn to pass 
very near the plotted points and hence a number of equations can 
be found to represent the data approximately. The choice of the 
proper mathematical function depends a great deal upon the investi- 
gator’s knowledge of the properties of curves and his experience 
in curve-fitting. Fortunately, there are a number of simple tests 
that may be employed to enable us to make an intelligent choice 
of the type of equation to be used. Of course one can select an 
equation in which the number of undetermined constants equals the 
number of the observations and thus have the resulting curve pass 
through the observed points exactly, but this process emphasizes 
the minor fluctuations that represent simply errors of observation 
and renders impossible the discovery of a simple law. A better 
procedure is to select a simple type of function involving only a 
few constants and thus allow for fluctuations due to sampling. 

Having chosen a particular type of function with which to graduate
the data, our specific question is: How can the constants of the equation
be determined in order to obtain the curve of that type of best fit? The
method employed depends upon the desired degree of accuracy.
We may employ one or more of four methods: (1) the method of
selected points, (2) the method of averages, (3) the method of least
squares, or (4) the method of moments. Of these methods the first is the
simplest; the second requires more computation than the first but
usually gives better results; the third requires considerable
computation but gives the best results and a unique answer to our question;
the fourth gives a unique answer that is identical to that obtained
by the third for polynomial functions.

80. THE PROCESS OF DIFFERENCING 

In the preceding section we alluded to certain simple tests that 
may be employed to assist us in choosing the appropriate type of 
equation to represent our data. Inasmuch as these tests will fre- 
quently be stated in the language of differences , it may be well that 
we digress at this point from our general problem to learn the rudi- 
ments of this language. Consider the following table: 



Table 66

 X      Y     ΔY    Δ²Y    Δ³Y    Δ⁴Y
(1)    (2)    (3)    (4)    (5)    (6)

 0      1      3
 1      4      6      3      1
 2     10     10      4      1      0
 3     20     15      5      1      0
 4     35     21      6      1      0
 5     56     28      7
 6     84


Corresponding values of X and Y, where Y is some undetermined
function of X, are given in columns (1) and (2). In column (3),
headed $\Delta Y$, we have the first differences of Y. Any value of $\Delta Y$
is found by subtracting a value of Y from its successor. Thus,
3 = 4 - 1, 6 = 10 - 4, etc. Similarly column (4), headed $\Delta^2 Y$,
is obtained by subtracting each $\Delta Y$ from its successor. These
values are called the second differences of Y. Other differences are
found in a similar manner. In the table we are considering it may
be noted that the values of $\Delta^2 Y$ are in arithmetic progression, those
of $\Delta^3 Y$ are constant and hence all higher differences are zero.
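The construction of such a table is purely mechanical, as the following sketch in Python suggests. The function name `difference_table` is our own; each row of the result holds the successive differences of the row before it, so that row 1 is column (3) of Table 66, row 2 is column (4), and so on.

```python
def difference_table(y):
    """Return [Y, first differences, second differences, ...] as lists."""
    table = [list(y)]
    while len(table[-1]) > 1:
        prev = table[-1]
        # Subtract each value from its successor.
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table


# The Y column of Table 66.
for order, row in enumerate(difference_table([1, 4, 10, 20, 35, 56, 84])):
    print(order, row)
```

Run on the Y column of Table 66, the rows of order 1 to 4 reproduce columns (3) to (6), and the constant third differences show at once as a row of equal entries.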

EXERCISE 

Begin at the right-hand side of Table 66, work back to the left and show
that when X = 7, Y = 120.

The values of X may differ by amounts other than unity. In
general we may indicate by $\Delta X$ the difference in X. When the
difference in successive X's is the same (that is, when $\Delta X$ is
constant) and Y is a function of X:

\[
\Delta Y_X = Y_{X+\Delta X} - Y_X
\tag{1}
\]

In the following table, where again Y is an undetermined function
of X, we have, for example, $\Delta X = 2$.

Table 67

 X      Y     ΔY    Δ²Y    Δ³Y

 0      0      2
 2      2      6      4      0
 4      8     10      4      0
 6     18     14      4
 8     32





In experimental data involving two variables, the independent 
variable is usually subject to the control of the experimenter, and 
the values of the independent variable are frequently given in 
arithmetic progression. That is, if X is the independent variable, 
AX is frequently constant. We shall see that this precaution on 
the part of the experimenter may greatly simplify the discovery of 
an appropriate equation. 

Consider the straight line:

\[
Y = mX + b
\]

We have from (1):

\[
\begin{aligned}
\Delta Y &= m(X + \Delta X) + b - (mX + b) \\
\Delta Y &= m \cdot \Delta X
\end{aligned}
\]

From this result it is seen that if $\Delta X$ is constant, $\Delta Y$ is also
constant; further, $\Delta Y/\Delta X$ is constant (compare Section 57, p. 204).
Consider now the parabola:

\[
Y = aX^2 + bX + c
\]

Applying (1):

\[
\begin{aligned}
\Delta Y &= a(X + \Delta X)^2 + b(X + \Delta X) + c - (aX^2 + bX + c) \\
\Delta Y &= 2aX\,\Delta X + b\,\Delta X + a(\Delta X)^2 \\
\Delta(\Delta Y) = \Delta^2 Y &= 2a(X + \Delta X)\Delta X + b\,\Delta X + a(\Delta X)^2 - 2aX\,\Delta X - b\,\Delta X - a(\Delta X)^2 \\
\Delta^2 Y &= 2a(\Delta X)^2
\end{aligned}
\]

From this result we see that if $\Delta X$ is constant, the second difference
of the polynomial $aX^2 + bX + c$ is also constant.
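As a brief check, offered only as an illustration, the result may be applied to Table 67, where $\Delta X = 2$ and the second differences are all equal to 4. The constants $b$ and $c$ are then found from the first two tabulated points:

\[
\begin{aligned}
\Delta^2 Y = 4 = 2a(\Delta X)^2 = 2a(2)^2 = 8a \quad &\Rightarrow \quad a = \tfrac{1}{2} \\
Y = 0 \text{ when } X = 0 \quad &\Rightarrow \quad c = 0 \\
Y = 2 \text{ when } X = 2: \quad \tfrac{1}{2}(2)^2 + 2b = 2 \quad &\Rightarrow \quad b = 0
\end{aligned}
\]

Hence $Y = \tfrac{1}{2}X^2$, which reproduces the remaining entries of the table: 8, 18, and 32 for X = 4, 6, and 8.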



One may continue this process and show that if $\Delta X$ is constant,
the nth difference of a polynomial of the nth degree is also constant.

The converse of this theorem is also true, namely:

If for a constant $\Delta X$, $\Delta^n Y$ is also constant, then Y is a polynomial
in X of degree n.¹

The nth differences of the values of Y obtained from observational
data are seldom constant. If, however, the nth differences of Y are
approximately constant, $\Delta X$ being constant, we can represent the
data approximately by:

\[
Y = aX^n + bX^{n-1} + \cdots + k
\]


EXERCISES 

1. If $Y = c$, show that $\Delta Y = 0$.

2. If $Y = X^3$, show that $\Delta^3 Y = 6(\Delta X)^3$.

3. Prepare a table for the function $Y = 2X^2 - 3X + 4$ for X = 0, 1,
2, 3, 4 and find the second differences from the table.

4. Prepare a table for the function $Y = X^3 - X^2 + 8X + 2$ for
X = 1, 3, 5, 7, 9 and find the third differences from the table.

5. In the following table, $\Delta X$ is constant (= 1) and $\Delta^2 Y$ is constant
(= 2). Hence Y is a quadratic function of X:

\[
Y = aX^2 + bX + c
\]

Find the values of a, b, and c. Find Y when X = 5.

Hint:

\[
\Delta^2 Y = 2 = 2a(\Delta X)^2 = 2a(1)^2 = 2a
\]

¹ For a proof, see T. R. Running, Empirical Formulas, p. 18.

6. Complete the accompanying table. Find the function that
represents the data. Find Y when X = 5.

 X      Y     ΔY    Δ²Y    Δ³Y

 0      0
 1     -1
 2      4
 3     21
 4     56


7. Prove that if a sequence of numbers is in geometric progression, their 
logarithms are in arithmetic progression. 

8. Prove that if a sequence of numbers is in geometric progression, their 
first differences are in geometric progression. 


81. FITTING A STRAIGHT LINE TO OBSERVED DATA 

A large portion of Chapter 7 was devoted to the problem of fitting 
a straight line to observed data by the method of least squares. 
We desired at that time to emphasize the method of least squares 
because we were then interested in finding a unique line for which 
we could secure a test for the goodness of fit and thus arrive at the 
Bravais-Pearson cross-product coefficient of correlation. Since one 
may frequently not desire so accurate a solution as is given by the 
method of least squares — especially at the price of tedious computa- 
tion one must pay to secure it — we shall discuss two other less ac- 
curate methods. 

A. The Method of Selected Points. To apply this method we 
must plot the observed data carefully. We then draw a straight 
line among the points which will pass as near as possible to each of 
them. Since the straight-line equation 

Y = mX + b 

has two undetermined constants, m and b, we must obtain two
equations with m and b as unknowns from which to determine them. If
the line happens to pass through two of the plotted points or through
any other two points whose coordinates can be determined
approximately, we can substitute their coordinates in the given equation and
solve the two resulting equations for m and b. In any case the points
so used should be as far apart as possible.




Consider again the temperature-resistance data of Table 42, 
to which we have previously given attention in Section 59 (p. 211). 



Table 68

   t        R

 10.5     10.42
 29.5     10.94
 42.7     11.32
 60.0     11.80
 75.5     12.24
 91.1     12.67


These data, when plotted, present six points that may seem to lie
upon a straight line. Let us seek further evidence by applying the
test for straight-line data. We have learned in the preceding section
that if $\Delta Y/\Delta X$ is constant the data can be fitted to a straight-line
equation. In Table 69 we have computed the several values of
$\Delta R/\Delta t$.

Table 69

  Δt       t        R       ΔR     ΔR/Δt

          10.5    10.42
 19.0     29.5    10.94    0.52    0.0274
 13.2     42.7    11.32    0.38    0.0289
 17.3     60.0    11.80    0.48    0.0277
 15.5     75.5    12.24    0.44    0.0284
 15.6     91.1    12.67    0.43    0.0276

                            Mean = 0.0280

Since they are approximately constant we are justified in
concluding that the data may be fitted approximately to a
straight-line equation:

\[
R = mt + b
\]



The line we have drawn does not pass through any of the given
points. However, it seems to pass through the points A(20, 10.7)
and B(90, 12.6) whose ordinates we have estimated from the graph.
Substituting in the given equation the coordinates of the points
we have:

\[
\begin{aligned}
b + 20m &= 10.7 \\
b + 90m &= 12.6
\end{aligned}
\]

from which we obtain

\[
m = 0.027 \qquad b = 10.157
\]

Hence the required relation is:

\[
R = 0.027t + 10.157
\]

The least-square solution (Exercise 1 on p. 220) gives:

\[
R = 0.02799t + 10.122
\]
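The solution of the two equations may be sketched in Python as follows. The function name `line_through` is our own, and the two points are those read from the graph above, so the result is no more accurate than those readings.

```python
def line_through(p, q):
    """Slope m and intercept b of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)   # subtract the two equations to eliminate b
    b = y1 - m * x1             # substitute m back into the first equation
    return m, b


# Points A and B estimated from the graph.
m, b = line_through((20, 10.7), (90, 12.6))
print(round(m, 3), round(b, 3))
```

Choosing the two points far apart, as recommended above, keeps the denominator of the slope large and so lessens the effect of an error in either estimated ordinate.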


EXERCISE 

Assume the line passes through the first and last points, (10.5, 10.42) 
and (91.1, 12.67), and find its equation. 

It will be noted that the arithmetic mean of the values of $\Delta R/\Delta t$
in Table 69 is 0.0280. How may this be used in finding an equation
for a line fitting our data approximately?

If we take this average slope as the slope of our required line we
have:

\[
R = 0.0280t + b
\]

We can now substitute the coordinates of each of the six given
points and thus determine six values of b. Their mean may be taken
as the value of b for the required line. We shall leave the
computation as an exercise for the student. He should receive for an answer:

\[
R = 0.0280t + 10.1216
\]
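The procedure just described, with the data of Table 68, might be sketched in Python as follows. The variable names are our own; the slope is rounded to four decimal places before the intercepts are formed, as in the text.

```python
# Temperature-resistance data of Table 68.
t = [10.5, 29.5, 42.7, 60.0, 75.5, 91.1]
R = [10.42, 10.94, 11.32, 11.80, 12.24, 12.67]

# Successive values of (delta R) / (delta t), as in Table 69.
slopes = [(R[i + 1] - R[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]
m = round(sum(slopes) / len(slopes), 4)

# One value of b from each of the six points, then their mean.
intercepts = [Ri - m * ti for ti, Ri in zip(t, R)]
b = sum(intercepts) / len(intercepts)

print(m, round(b, 4))
```

The mean of the six intercepts is the same thing as the mean of R less m times the mean of t, so the line so obtained passes through the point of means.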

B. The Method of Averages. The fundamental principle of the
method of averages is that an empirical curve of given type best
fitting a given group of points is one for which the algebraic sum of
the residuals is zero. (It will be recalled that this criterion was
satisfied by the line determined by the method of least squares.¹) From
Section 59 (p. 210), if $\rho_i$ is any residual:

\[
\rho_i = Y_i - mX_i - b
\quad\text{and}\quad
\sum\rho_i = \sum Y_i - m\sum X_i - nb
\]

¹ See Exercise 5 on p. 221.

Since the sum of the residuals is zero we have:

\[
m\sum X + nb = \sum Y
\]

In order to obtain two equations which may be solved for the
unknowns, m and b, we divide our data, Table 68 (p. 312), into two
groups each containing three sets of data. For the first group we
choose the first three sets of data for which $\sum t = 82.7$, $n = 3$,
$\sum R = 32.68$, and for the second group the remaining three sets of data
for which $\sum t = 226.6$, $n = 3$, $\sum R = 36.71$. We then have the
equations:

\[
\begin{aligned}
82.7m + 3b &= 32.68 \\
226.6m + 3b &= 36.71
\end{aligned}
\]

from which we obtain

\[
m = 0.0280 \qquad b = 10.121
\]

Hence the required relation is:

\[
R = 0.0280t + 10.121
\]
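The grouping and the solution of the two resulting equations can be sketched in Python as below. The function name `method_of_averages` is our own, and the division of the data into a first and a second half is the one adopted in the text; other groupings would in general give slightly different lines.

```python
def method_of_averages(x, y):
    """Fit y = m*x + b by requiring the residuals of each half to sum to zero."""
    half = len(x) // 2
    sx1, sy1, n1 = sum(x[:half]), sum(y[:half]), half
    sx2, sy2, n2 = sum(x[half:]), sum(y[half:]), len(x) - half
    # Solve  m*sx1 + n1*b = sy1  and  m*sx2 + n2*b = sy2  by determinants.
    det = sx1 * n2 - sx2 * n1
    m = (sy1 * n2 - sy2 * n1) / det
    b = (sx1 * sy2 - sx2 * sy1) / det
    return m, b


# Temperature-resistance data of Table 68.
t = [10.5, 29.5, 42.7, 60.0, 75.5, 91.1]
R = [10.42, 10.94, 11.32, 11.80, 12.24, 12.67]
m, b = method_of_averages(t, R)
print(round(m, 4), round(b, 3))
```

Unlike the method of least squares, this method does not give a unique line: the answer depends upon how the observations are divided into groups.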

C. The Method of Least Squares. Curve-fitting by the method 
of least squares is based upon the principle that the empirical curve 
of a given type best fitting a given set of points is that one in which 
the constants are so determined that they will make the sum of the 
squares of the residuals a minimum. Since the squares of the resid- 
uals are positive quantities, the requirement that their sum shall be 
a minimum gives assurance that the numerical values of the residuals 
will be such that the best-fitting curve will pass as close as possible 
to all the points. 

Inasmuch as Section 59 (p. 210) was devoted to the problem of 
fitting the line 

Y = mX + b 

to a set of points by the method of least squares, we shall merely 
recapitulate here the findings of that section. By minimizing 

\[
\sum\rho_i^2 = \sum(Y_i - mX_i - b)^2
\]

where $\rho_i$ is the Y-residual of the ith point, we obtain the normal
equations

\[
\begin{aligned}
m\sum X + nb &= \sum Y \\
m\sum X^2 + b\sum X &= \sum XY
\end{aligned}
\]

which, when solved, gave:

\[
m = \frac{n\sum XY - \sum X\sum Y}{n\sum X^2 - (\sum X)^2}
\qquad
b = \frac{\sum X^2\sum Y - \sum X\sum XY}{n\sum X^2 - (\sum X)^2}
\]
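These two formulas translate directly into a few lines of Python. The sketch below uses a function name of our own choosing, `least_squares_line`, and is applied to the temperature-resistance data of Table 68, for which the least-square line has been quoted earlier in this section.

```python
def least_squares_line(x, y):
    """Slope m and intercept b from the solved normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    den = n * sxx - sx ** 2
    m = (n * sxy - sx * sy) / den
    b = (sxx * sy - sx * sxy) / den
    return m, b


# Temperature-resistance data of Table 68.
t = [10.5, 29.5, 42.7, 60.0, 75.5, 91.1]
R = [10.42, 10.94, 11.32, 11.80, 12.24, 12.67]
m, b = least_squares_line(t, R)
print(round(m, 5), round(b, 3))
```

The common denominator vanishes only when all the values of X are equal, in which case no line of the form Y = mX + b can be determined from the data.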

EXERCISES 

1. Show that the point $(M_X, M_Y)$ is on a line determined by the method
of averages.

2. The following table gives the population of France at each census 
from 1806 to 1866. Determine by the method of averages a straight line 
well adapted to the data, choosing X = 0 at 1836. 


Population of France, 1806-1866

Year    Population        Year    Population
        (millions)                (millions)

1806      29.11           1851      35.78
1821      30.46           1856      36.04
1831      32.57           1861      37.39
1836      33.54           1866      38.07
1846      35.40


3. Find by the method of least squares the equation of the best-fitting 
straight line to the data of the following table. What are the predicted 
net earnings, based upon this line, for the year 1929? The actual net 
earnings were 48.5 millions. 


Annual Earnings of the Associated Gas and
Electric System, 1920-1928 1

Year    Net Earnings             Year    Net Earnings
        (millions of dollars)            (millions of dollars)
1920        13.4                 1925        29.5
1921        16.2                 1926        33.5
1922        19.2                 1927        37.8
1923        22.7                 1928        40.6
1924        25.1                 1929

1 The data are from Time, Jan. 27, 1930.







82. THE EXPONENTIAL FUNCTION \(Y = ab^X\)

Excepting the linear function, probably no expression with two
undetermined constants is more useful in characterizing observed
data than the exponential function \(Y = ab^X\). It may be described
as that function whose rate of change is proportional to the value
of the function. The rate of change may be positive or negative,
that is, Y may increase with X or Y may decrease as X increases.
Because the accumulated amount of a sum of money placed at compound
interest at a given rate for a given time is expressed by this
function, it is known as the compound interest law. Thus, if $100 is
placed at compound interest for X years at 5 per cent, the accumulated
amount Y is given by:

\[ Y = 100(1.05)^X \]

We represent this function graphically. 


Figure 43

Table 70

 X        Y
 0     100.00
 5     127.63
10     162.89
15     207.89
20     265.33
25     338.64
30     432.19
35     551.60
40     704.00
45     898.50
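
The entries of Table 70 can be regenerated in a few lines. This is only a sketch of the compound interest formula above; the function name `compound_amount` is an illustrative choice.

```python
def compound_amount(principal, rate, years):
    """Accumulated amount of `principal` at compound interest `rate` per year."""
    return principal * (1 + rate) ** years


# Tabulate Y = 100(1.05)^X for X = 0, 5, ..., 45, as in Table 70.
table = [(x, round(compound_amount(100, 0.05, x), 2)) for x in range(0, 50, 5)]
for x, y in table:
    print(f"{x:2d}  {y:7.2f}")
```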


The exponential function is also called the law of organic growth
because many biological phenomena obey closely this law of growth.
For example, a culture of bacteria, or populations of mice, of
rabbits, of human beings, when placed in environments conducive
to growth, will increase for a time in approximate accordance with
this law.

The exponential law is applicable to many other types of data.
Many data from the commercial and the economic fields show
exponential trends. We find the law especially applicable to data on
production and to data on the periodic earnings of many industrial
organizations.

A simple test for the exponential function is contained in the 
following: 

Theorem I. If the values of X are in arithmetic progression and
the corresponding values of Y are in geometric progression, the relation
between the variables is expressed by the equation:

\[ Y = ab^X \]

Table 71

X                                 Y
\(X_1\)                           \(Y_1\)
\(X_2 = X_1 + \Delta X\)          \(Y_2 = rY_1\)
\(X_3 = X_1 + 2\Delta X\)         \(Y_3 = r^2 Y_1\)
. . .                             . . .
\(X_n = X_1 + (n-1)\Delta X\)     \(Y_n = r^{n-1} Y_1\)

From the hypothesis we have the data as shown in the accompanying
table. Since

\[ X_n = X_1 + (n - 1)\Delta X \]

we have:

\[ n - 1 = \frac{X_n - X_1}{\Delta X} \]

and hence

\[ Y_n = Y_1 r^{(X_n - X_1)/\Delta X} = Y_1 r^{-X_1/\Delta X} \left( r^{1/\Delta X} \right)^{X_n} \]

or

\[ Y_n = ab^{X_n} \]

when

\[ a = Y_1 r^{-X_1/\Delta X} \quad \text{and} \quad b = r^{1/\Delta X} \]

That is, any X is connected with the corresponding Y by the
relation:

\[ Y = ab^X \qquad (2) \]

Illustrative Problem 1. Consider Table 72, which gives the population
of the United States at each ten-year census from 1800 to 1890.

Table 72. Population of the United States, 1800-1890

 t     Year     Population     Ratio of Each Population
        X       (millions)     to the One Above
                    P
 0     1800        5.3
 1     1810        7.2              1.36
 2     1820        9.6              1.33
 3     1830       12.9              1.34
 4     1840       17.1              1.33
 5     1850       23.2              1.36
 6     1860       31.4              1.35
 7     1870       38.6              1.23
 8     1880       50.2              1.30
 9     1890       63.0              1.25

                          Mean      1.3167


Let \(t = (X - 1800)/10\).

Here we note that the values of X (and t) are in arithmetical
progression and that the values of P are approximately in a geometric
progression since the ratio of any population to the one preceding
is approximately constant. Hence we may assume that the data
follow approximately the exponential law: \(P_t = ab^t\).

If we assume that the point t = 0, P = 5.3 is on the curve, and
that the decade rate of increase is the arithmetic mean of the ratios,
we can immediately obtain a first approximation formula:

\[ P_0 = 5.3 = ab^0 = a \]
\[ a = 5.3 \]

Since by our assumption

\[ P_t = a(1.3167)^t = ab^t \]

we have:

\[ b = 1.3167 \]

Hence

\[ P_t = 5.3(1.3167)^t \]

may be considered a first or crude approximation. By assigning
t = 0, 1, 2, 3, . . . , 9, the computed values of \(P_t\), which can be
compared with the observed values, can be found.



For a closer approximation we proceed as follows. From

\[ P = ab^t \]

we have:

\[ \log P = (\log b)t + \log a \]

Now let \(Y = \log P\), \(m = \log b\), \(k = \log a\).

We then have:

\[ Y = mt + k \]

which is a straight line. Therefore we can fit the curve \(P = ab^t\) to
the given data by fitting the line \(Y = mt + k\) to the corresponding
(t, Y = log P) data. We shall do this by the method of least squares.
From Sections 59 (p. 218) and 81 (p. 315) we have:

\[ m = \frac{n\sum tY - \sum t \sum Y}{n\sum t^2 - (\sum t)^2} = \frac{n\sum t \log P - \sum t \sum \log P}{n\sum t^2 - (\sum t)^2} \]

\[ k = \frac{\sum t^2 \sum Y - \sum t \sum tY}{n\sum t^2 - (\sum t)^2} = \frac{\sum t^2 \sum \log P - \sum t \sum t \log P}{n\sum t^2 - (\sum t)^2} \]

We shall use the following form with eight-place logarithms to 
assist in finding m and k. 

Table 73

  t       P        log P       \(t^2\)     t log P      Computed P
  0      5.3     0.7242759        0       0.0000000        5.5
  1      7.2     0.8573325        1       0.8573325        7.3
  2      9.6     0.9822712        4       1.9645424        9.6
  3     12.9     1.1105897        9       3.3317691       12.7
  4     17.1     1.2329961       16       4.9319844       16.8
  5     23.2     1.3654880       25       6.8274400       22.2
  6     31.4     1.4969296       36       8.9815776       29.3
  7     38.6     1.5865873       49      11.1061111       38.6
  8     50.2     1.7007037       64      13.6056296       51.0
  9     63.0     1.7993405       81      16.1940645       67.3
 ---------------------------------------------------------------
 45             12.8565145      285      67.8004512



\[ m = \log b = \frac{10(67.8004512) - 45(12.8565145)}{10(285) - (45)^2} = 0.1205592 \]

\[ b = 1.319967 \]

\[ k = \log a = \frac{285(12.8565145) - 45(67.8004512)}{10(285) - (45)^2} \]

\[ \log a = 0.7431349 \]
\[ a = 5.535186 \]

Hence our law is:

\[ P = 5.535186(1.319967)^t \]
or

\[ \log P = 0.1205592t + 0.7431349 \]

By assigning t = 0, 1, 2, . . . , 9 we obtain the computed values of
P which are found in Table 73.

We can use this formula to predict the populations in 1900, 1910, 
1920 by assigning t = 10, 11, 12. We find the predicted populations 
to be 88.9, 117.3, and 154.8, whereas the actual populations were 76.0, 
92.0, and 105.7. This shows that an empirical formula must be 
used with caution for values outside the given abscissal range. In 
this particular case the exponential law ceased to operate after 1870 
and we began then to approach the point of saturation. 
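
The least-squares fit of Illustrative Problem 1 and the three extrapolated values can be reproduced with the sketch below. It uses the census figures of Table 72 and common logarithms, as the text does; the function names are illustrative, and the machine logarithms may differ from the book's tabulated logarithms in the last decimal place.

```python
import math

# Census data of Table 72: t = (year - 1800)/10, P in millions.
t_values = list(range(10))
populations = [5.3, 7.2, 9.6, 12.9, 17.1, 23.2, 31.4, 38.6, 50.2, 63.0]


def fit_exponential(ts, ps):
    """Least-squares fit of log10(P) = m*t + k; returns (m, k)."""
    n = len(ts)
    ys = [math.log10(p) for p in ps]
    st = sum(ts)
    sy = sum(ys)
    stt = sum(t * t for t in ts)
    sty = sum(t * y for t, y in zip(ts, ys))
    denom = n * stt - st ** 2
    m = (n * sty - st * sy) / denom
    k = (stt * sy - st * sty) / denom
    return m, k


m, k = fit_exponential(t_values, populations)
a, b = 10 ** k, 10 ** m


def predict(t):
    """Population in millions given by the fitted law P = a * b**t."""
    return a * b ** t


# Extrapolation to 1900, 1910, 1920 (t = 10, 11, 12).
predicted = [round(predict(t), 1) for t in (10, 11, 12)]
print(round(m, 7), round(k, 7))
print(predicted)
```

Comparing `predicted` with the observed 76.0, 92.0, and 105.7 shows how quickly an empirical formula can drift once it is carried outside the range of the data.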

We shall leave it as an exercise for the student to find the law for 
the population based upon the method of averages. 

It frequently happens that the data are not given with the values 
of the independent variable in arithmetic progression and hence 
the test of Theorem I will not apply. In such cases we can use the 
following: 


Theorem II. If the variables X and Y are so related that \(\Delta \log Y/\Delta X\)
is constant, then the relation between them can be expressed by the
formula:

\[ Y = ab^X \]

Since by hypothesis

\[ \frac{\Delta \log Y}{\Delta X} = m, \]

we have by Section 80 (p. 310):

\[ \log Y = mX + k \]

or, if \(k = \log a\),

\[ \log Y = mX + \log a \]
\[ \log Y - \log a = mX \]
\[ \log (Y/a) = mX \]
\[ Y/a = 10^{mX} = (10^m)^X = b^X \]

or

\[ Y = ab^X \]

where \(10^m = b\).


We shall apply this theorem to 

Illustrative Problem 2. The following table shows the amount A of 
a substance remaining in a reacting chemical system at the expiration 
of a given time t (Harcourt and Esson). 


Table 74

  t      A        log A      \(\Delta t\)   \(\Delta \log A\)   \(\Delta \log A/\Delta t\)
  2     94.8    1.9768083        3           -0.0328194             -0.0109
  5     87.9    1.9439889        3           -0.0338984             -0.0113
  8     81.3    1.9100905        3           -0.0356087             -0.0119
 11     74.9    1.8744818        3           -0.0375251             -0.0125
 14     68.7    1.8369567        3           -0.0307767             -0.0103
 17     64.0    1.8061800       10           -0.1133331             -0.0113
 27     49.3    1.6928469        4           -0.0493942             -0.0123
 31     44.0    1.6434527        4           -0.0512759             -0.0128
 35     39.1    1.5921768        9           -0.0924897             -0.0103
 44     31.6    1.4996871

The values of \(\Delta \log A/\Delta t\) are fairly constant and we conclude
therefore that the data may be represented approximately by

\[ A = ab^t \]
or

\[ \log A = (\log b)t + \log a \]
or

\[ Y = mt + k \]

where \(Y = \log A\), \(m = \log b\), and \(k = \log a\).

We shall use the method of averages to determine the constants.

Dividing the data into two groups, the first five sets of data for
the first group and the remaining five sets of data for the second
group, we obtain:

\[ \sum Y = \sum \log A = 9.5423262, \quad \sum t = 40, \quad n = 5 \]

\[ \sum Y = \sum \log A = 8.2343435, \quad \sum t = 154, \quad n = 5 \]



Recalling that in the method of averages the sum of the residuals
is zero; that is,

\[ \sum (Y - mt - k) = 0 \]
or

\[ \sum Y = m\sum t + nk \]

we have upon substituting the above values:

\[ 5k + 40m = 9.5423262 \]
\[ 5k + 154m = 8.2343435 \]

Solving, we obtain:

m = log b = -0.0114735
k = log a = 2.0002532

b = 0.973927
a = 100.0586

Hence the law is:

\[ A = 100.0586(0.973927)^t \]
or

\[ \log A = -0.0114735t + 2.0002532 \]

By substituting the given values of t we obtain the computed 
values of log A from which we obtain the computed values of A 
which are shown in the following table. 


  t     Observed A     Computed log A     Computed A     Residuals
  2        94.8          1.9773062           94.9          -0.1
  5        87.9          1.9428857           87.7           0.2
  8        81.3          1.9084652           81.0           0.3
 11        74.9          1.8740447           74.8           0.1
 14        68.7          1.8396242           69.1          -0.4
 17        64.0          1.8052037           63.9           0.1
 27        49.3          1.6904687           49.0           0.3
 31        44.0          1.6445747           44.1          -0.1
 35        39.1          1.5986807           39.7          -0.6
 44        31.6          1.4954192           31.3           0.3
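
The method-of-averages solution of Illustrative Problem 2 can be sketched as follows. The data are those of Table 74; the function `fit_by_averages` and its `split` parameter are illustrative names, and the grouping (first five sets, last five sets) follows the text.

```python
import math

# Data of Table 74: time t and amount A of substance remaining.
times = [2, 5, 8, 11, 14, 17, 27, 31, 35, 44]
amounts = [94.8, 87.9, 81.3, 74.9, 68.7, 64.0, 49.3, 44.0, 39.1, 31.6]


def fit_by_averages(xs, ys, split):
    """Fit Y = m*X + k by the method of averages, grouping at index `split`."""
    n1, n2 = split, len(xs) - split
    sx1, sy1 = sum(xs[:split]), sum(ys[:split])
    sx2, sy2 = sum(xs[split:]), sum(ys[split:])
    # Solve  sx1*m + n1*k = sy1  and  sx2*m + n2*k = sy2  by Cramer's rule.
    det = sx1 * n2 - sx2 * n1
    m = (sy1 * n2 - sy2 * n1) / det
    k = (sx1 * sy2 - sx2 * sy1) / det
    return m, k


log_amounts = [math.log10(amount) for amount in amounts]
m, k = fit_by_averages(times, log_amounts, split=5)
a, b = 10 ** k, 10 ** m

# Residuals: observed A minus computed A, as in the table above.
residuals = [obs - a * b ** t for t, obs in zip(times, amounts)]
print(round(m, 7), round(k, 7), round(a, 4), round(b, 6))
```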


Exercise. Solve this problem by method of averages using four-place 
logarithms. 





EXERCISES 1 

1. Show that an exponential curve may give a satisfactory fit for the 
data of Table 65 (p. 306). Fit an exponential curve to these data and esti- 
mate from the equation the values of Y when X = 32 and when X = 36. 
Compare these results with the actual values: Y = 21.3 when X = 32, 
and Y = 16.8 when X = 36. 

Answer: Method of least squares gives \(Y = 108.8035(0.952348)^X\).

2. In the following table p is the barometric pressure in inches of a
column of mercury at distance h in feet above the sea level. Show that
an exponential curve, \(p = ab^h\), may be appropriately applied to these data.
Find the equation of the best-fitting curve and the values of p when
h = 1,000 ft., 2,000 ft., 5,000 ft.

h      0     886    2,753    4,763    6,942
p     30      29       27       25       23


3. The following table exhibits the values of the temperature T reached
by a cooling body at the expiration of various times t. Determine the
best-fitting curve of the type \(T = ab^t\) for the data of this table.


E8 






44.11 

59.12 

89 






9.8 

8.0 


4. Fit an exponential curve to the data of Exercise 1, page 91. 

5. The following observations were made on a growing plant. The
time is reckoned in days from the first observation. What is the law of 
growth? 


Days 



2 




6 


8 

Height ( inches ) 









11.50 


1 We leave it to the discretion of the teacher to suggest the method that is to
be used in solving these exercises.

83. THE POWER FUNCTION \(Y = aX^b\)

In the preceding sections of this chapter we have dealt with the
problems which involved fitting the linear function Y = aX + b and
the exponential function \(Y = ab^X\) to observed data. A third function
with two undetermined constants, the power function \(Y = aX^b\),
finds frequent application. Owing to the fact that the constants
can be determined approximately by rectifying the curve, that is,
by transforming it into a straight-line equation, as was done
with the exponential function, the power curve is not difficult to
employ.

The power function is parabolic in form when b is positive and 
hyperbolic if b is negative. The parabolic curves all pass through 
the points (0, 0) and (1, a) and also enjoy the property that Y in- 
creases with X. The hyperbolic curves all pass through the point 
(1, a), have the coordinate axes as asymptotes, and enjoy the 
property that Y decreases as X increases. 

EXERCISE 

Plot on the same coordinate axes for X > 0 the curves: 


a. \(Y = 2X^2\)          d. \(Y = 2X^{-2}\)
b. \(Y = 2X\)            e. \(Y = 2X^{-1}\)
c. \(Y = 2X^{0.1}\)      f. \(Y = 2X^{-0.1}\)


A simple test — not always applicable — for determining if the 
power function is applicable is contained in: 

Theorem I. If the values of X are in geometrical progression and
the corresponding values of Y are also in geometrical progression, then
the relation between the variables is expressed by the formula:

\[ Y = aX^b \]

Table 75

X                          Y
\(X_1\)                    \(Y_1\)
\(X_2 = rX_1\)             \(Y_2 = RY_1\)
\(X_3 = r^2 X_1\)          \(Y_3 = R^2 Y_1\)
. . .                      . . .
\(X_n = r^{n-1} X_1\)      \(Y_n = R^{n-1} Y_1\)

From the hypothesis we have the data as in Table 75. Since

\[ X_n = r^{n-1} X_1 \quad \text{and} \quad Y_n = R^{n-1} Y_1 \]

we have, applying logarithms:

\[ n - 1 = \frac{\log X_n - \log X_1}{\log r} \]

and

\[ n - 1 = \frac{\log Y_n - \log Y_1}{\log R} \]

Equating these values of (n - 1) and writing \(\log R/\log r = b\),
we have

\[ \frac{\log Y_n - \log Y_1}{\log X_n - \log X_1} = b \]

that is:

\[ \log (Y_n/Y_1) = \log (X_n/X_1)^b \]
or

\[ Y_n = (Y_1/X_1^b) X_n^b = aX_n^b \]

where

\[ b = \log R/\log r \quad \text{and} \quad a = Y_1/X_1^b \]

That is, for any set of corresponding values we have:

\[ Y = aX^b \]

There is an evident practical difficulty with this beautiful theorem. 
There is rarely any reason, or even an opportunity, for the observer 
to gather his data with the values of one variable in geometric pro- 
gression and thus make possible a test to determine if the other 
variable is also in geometric progression. In general the observer 
has no predilections as to the law; he gathers the data and may hope 
to discover a law. Very frequently, however, the careful observer 
will, if possible, secure data with the independent variable ordered 
in some definite manner, most frequently in arithmetic progression. 

When Theorem I may not be applicable we may be able to use the 
following: 

Theorem II. If the values of X and Y are so related that
\(\Delta \log Y/\Delta \log X\) is constant, then the relation between the variables is
expressed by:

\[ Y = aX^b \]

Since

\[ \frac{\Delta \log Y}{\Delta \log X} = b, \ \text{a constant,} \]

we have by Section 80 (p. 310):

\[ \log Y = b \log X + c \]

\[ \log Y = \log X^b + \log a \quad (\text{if } c = \log a) \]

\[ \log Y = \log aX^b \]
or

\[ Y = aX^b \]

Consider Table 76, which shows the currents, i, in amperes passing
through a 118-volt tungsten lamp for various terminal voltages, e.

Table 76

   e        i        Ratio of Any i
                     to Preceding
   2     0.0245
   4     0.0370          1.51
   8     0.0570          1.54
  16     0.0855          1.50
  32     0.1295          1.51
  64     0.2000          1.54
 128     0.3035          1.53


We note that the independent variable, e, is given in a geometric
progression. We find, as the table shows, that the corresponding
values of i are also essentially in geometric progression. Therefore
the data follow the law:

\[ i = ae^b \qquad (4) \]

Since

\[ \log i = b(\log e) + \log a \qquad (5) \]

if we let \(Y = \log i\), \(X = \log e\), \(k = \log a\),

we have:

\[ Y = bX + k \qquad (6) \]

which is a straight line. Therefore we may approximately fit the
curve (4), \(i = ae^b\), to the given data by fitting the line (6), \(Y = bX + k\),
to the corresponding (X = log e, Y = log i) data.

We shall first use the method of averages.

Table 77

   e        i        log e = X      log i = Y
   2     0.0245      0.3010300      \(\bar{2}.3891661\)
   4     0.0370      0.6020600      \(\bar{2}.5682017\)
   8     0.0570      0.9030900      \(\bar{2}.7558749\)
  16     0.0855      1.2041200      \(\bar{2}.9319661\)
  32     0.1295      1.5051500      \(\bar{1}.1122698\)
  64     0.2000      1.8061800      \(\bar{1}.3010300\)
 128     0.3035      2.1072100      \(\bar{1}.4821587\)


Dividing the data into two groups, the first four sets constituting
the first group and the last three sets the second group, we have:

\[ n = 4, \quad \sum X = 3.0103000, \quad \sum Y = \bar{6}.6452088 \]

\[ n = 3, \quad \sum X = 5.4185400, \quad \sum Y = \bar{3}.8954585 \]

Substituting these values in the residual equation

\[ b\sum X + nk = \sum Y \]

we have:

\[ 3.01030b + 4k = 4.6452088 - 10 \]
\[ 5.41854b + 3k = 7.8954585 - 10 \]

Solving, we have:

\[ b = 0.6047655 \]

\[ k = \log a = \bar{2}.2061708 \]

\[ a = 0.016076 \]

Hence by the method of averages the required relation is:

\[ i = 0.016076e^{0.6047655} \]
or

\[ \log i = 0.6047655 \log e + \bar{2}.2061708 \]


The computed values by this equation are given in Table 79, 
page 329. As an exercise the student should carry this problem 
through by averages, using four-place logarithms, and compare his 
results with ours. 

We shall now solve this exercise by the method of least squares. 
The exercise affords an excellent opportunity for illustrating a short 
method. Continuing Table 77, we shall employ the following sub- 
stitutions: 


\[ x' = \frac{X - 1.2041200}{0.3010300} \quad \text{and} \quad y' = Y + 1 \]

or

\[ X = 0.3010300x' + 1.2041200 \quad \text{and} \quad Y = y' - 1 \qquad (7) \]

Equation (6) will then become:

\[ y' - 1 = b(0.3010300x' + 1.2041200) + k \]
or

\[ y' = (0.30103b)x' + (1.20412b + k + 1) \]

or

\[ y' = mx' + k' \qquad (8) \]

where \(m = 0.30103b\) and \(k' = 1.20412b + k + 1\).
From Section 59 (p. 218) we find m and k' by:

\[ m = \frac{n\sum x'y' - \sum x' \sum y'}{n\sum x'^2 - (\sum x')^2} \]

\[ k' = \frac{\sum x'^2 \sum y' - \sum x' \sum x'y'}{n\sum x'^2 - (\sum x')^2} \]

We therefore continue Table 77 according to (7) and obtain
Table 78.

Table 78

  x'         y'          \(x'^2\)       x'y'
 -3      -0.6108339         9        1.8325017
 -2      -0.4317983         4        0.8635966
 -1      -0.2441251         1        0.2441251
  0      -0.0680339         0        0.0000000
  1       0.1122698         1        0.1122698
  2       0.3010300         4        0.6020600
  3       0.4821587         9        1.4464761
 ------------------------------------------------
  0      -0.4593327        28        5.1010293


We can now find m and k':

\[ m = 0.30103b = \frac{7(5.1010293)}{7(28)} = 0.18217961 \]

\[ b = 0.6051875 \]

\[ k' = 1.20412b + k + 1 = \frac{(28)(-0.4593327)}{7(28)} = -0.06561896 \]

\[ k = \log a = \bar{2}.2056627 \]
\[ a = 0.0160568 \]

Hence by the method of least squares the required relation is:

\[ i = 0.0160568e^{0.6051875} \]

or

\[ \log i = 0.6051875 \log e + \bar{2}.2056627 \]


In the following table we show the computed values which have 
been found from the equation determined by the method of averages 
and from the equation determined by the method of least squares. 


Table 79

  Observed Values                      Computed Values
                         By Least Squares               By Averages
   e        i            log i              i           log i              i
   2     0.0245    \(\bar{2}.3878423\)   0.0244   \(\bar{2}.3882234\)   0.0244
   4     0.0370    \(\bar{2}.5700219\)   0.0372   \(\bar{2}.5702759\)   0.0372
   8     0.0570    \(\bar{2}.7520148\)   0.0565   \(\bar{2}.7523285\)   0.0565
  16     0.0855    \(\bar{2}.9343811\)   0.0860   \(\bar{2}.9343810\)   0.0860
  32     0.1295    \(\bar{1}.1165607\)   0.1308   \(\bar{1}.1164336\)   0.1308
  64     0.2000    \(\bar{1}.2987403\)   0.1990   \(\bar{1}.2984862\)   0.1988
 128     0.3035    \(\bar{1}.4809199\)   0.3027   \(\bar{1}.4805387\)   0.3023
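
The least-squares fit of the power function to the lamp data can be checked with the sketch below. It fits the straight line (6) to the (log e, log i) pairs of Table 77; centering the logarithms about their means is an assumption of this sketch that plays the same simplifying role as substitution (7). The function name is illustrative.

```python
import math

# Data of Table 76: terminal voltage e and current i in amperes.
voltages = [2, 4, 8, 16, 32, 64, 128]
currents = [0.0245, 0.0370, 0.0570, 0.0855, 0.1295, 0.2000, 0.3035]


def fit_power_law(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns (a, b) of y = a*x**b."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # With centered abscissas the slope is a single ratio of sums.
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx
    k = mean_y - b * mean_x  # k = log10(a)
    return 10 ** k, b


a, b = fit_power_law(voltages, currents)
print(round(a, 7), round(b, 7))
```

Note that this minimizes the squared residuals of log i, not of i itself, which is why the text speaks of fitting the curve approximately.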


EXERCISES 

1. Find an equation of the form \(Y = aX^b\) for the data:

X      5     7     9    15    20    30    40    50
Y      1     2     3     9    16    37    65   100

2. Find an equation of the form \(Y = aX^b\) for the data:

X      4      8     12     16     20     24
Y    2.9   23.0   77.8    184    360    622

3. Find an equation of the form \(Y = aX^b\) for the data:

X     10    20    30    40    50    60
Y     11    31    57    88   122   161

4. If Y is the diameter of a tree in inches at age X years, the relation
is \(Y = aX^b\). For the following data, find the equation of the given type:

X     19    58    114    140    181    229
Y      3     7   13.2   17.9   24.5     33

5. A body in sliding down a plane of length l feet attained a velocity
of V feet per second. Find the relation \(V = al^b\) for the data given in the
table:

l    19.9   45.1   67.5   94.4    109    126
V    10.1   15.2   18.6   22.0   23.6   25.4

6. The quantity of water, Q pounds, discharged per second from a
circular orifice in a tank, under a pressure head of h feet, was found by
experiment to result in the following data. Find the equation of the
type \(Q = ah^b\).

h    0.583   0.667   0.750   0.834   0.876   0.958    1.0
Q     7.00    7.60    7.94    8.42    8.68    9.04   9.34

7. At the following draughts, h feet, a particular vessel has the given
tonnage, T, in salt water. Find the equation of the type \(T = ah^b\).

h      15      12       9      6
T    2100    1510    1020    590


84. THE PARABOLA \(Y = aX^2 + bX + c\)

Due to the fact that the quadratic parabola possesses a three- 
constant flexibility, it is very useful in graduating statistical data 
from many fields. Three constants are to be determined, and this 
can be done by (1) the method of selected points, (2) the method of 
averages, (3) the method of least squares, and (4) the method of mo- 
ments. 

To apply the method of selected points, we draw a curve among 
the plotted points which will pass as near as possible to each of them. 
If the curve happens to pass through three of the plotted points or 
through any other three points whose coordinates can be approxi- 
mately determined, we can substitute their coordinates in the given 
equation and solve the three resulting equations for a, b , and c. 
Of course the points so used should be chosen at the extreme and 
middle portions of the data. 

As previously stated, the method of averages assumes that the
sum of the residuals is zero. That is:

\[ \sum (Y - aX^2 - bX - c) = 0 \]
or

\[ a\sum X^2 + b\sum X + nc = \sum Y \qquad (9) \]

In order to obtain three equations which can be solved for the
unknowns a, b, and c, we divide our data into three sets. For each
set find n, \(\sum X\), \(\sum X^2\), and \(\sum Y\). Substitute in (9) and solve for a, b,
and c.



The method of least squares can be used to advantage with this
curve. By proceeding as in Section 59 (p. 217), we can find three
normal equations which can be solved for a, b, and c. Thus if \(\rho_i\) is
any residual:

\[ \rho_i = Y_i - aX_i^2 - bX_i - c \]

and

\[ \sum \rho_i^2 = \sum (Y_i - aX_i^2 - bX_i - c)^2 \]

The expression \(\sum \rho_i^2\) can be written as a quadratic in a, in b, and in c.
By imposing the condition that \(\sum \rho_i^2\) be a minimum upon each
quadratic we find the normal equations: 1

\[ a\sum X^2 + b\sum X + cn = \sum Y \]
\[ a\sum X^3 + b\sum X^2 + c\sum X = \sum XY \qquad (10) \]
\[ a\sum X^4 + b\sum X^3 + c\sum X^2 = \sum X^2 Y \]

Note that the first equation is merely the summation of the given
function; the second is the summation of X multiplied into the given
function, and the third is the summation of \(X^2\) multiplied into the
given function (see Exercise 16 at the end of this chapter).

If the values of X are in arithmetic progression, that is, if \(\Delta X\)
is constant, we can choose our units in such a manner that \(\sum X\)
and \(\sum X^3\) are zero. Further we may frequently use the relationships
in Exercises 2, page 10, and 20b, page 22, to determine \(\sum X^2\) and \(\sum X^4\).
By these artifices, the solution by least squares is not so laborious as
it might appear.

A test for the use of the parabola is contained in the general 
theorem of Section 80 (p. 310). We shall quote here the theorem for 
our special case. 

Theorem: If, when \(\Delta X\) is constant, \(\Delta^2 Y\) is also constant, the relation
between the variables may be expressed by the equation:

\[ Y = aX^2 + bX + c \qquad (11) \]

Illustrative Example. The following table gives the modulus of
torsion of steel T, in kilograms per square centimeter, at various
temperatures \(\theta\) in degrees Centigrade.

1 A knowledge of the calculus would enable the student to write out such
normal equations very easily. By setting the partial derivatives of \(\sum \rho_i^2\) with
respect to c, b, and a each equal to zero, equations (10) are obtained.


Table 80

\(\Delta\theta\)    \(\theta\)       T       \(\Delta T\)    \(\Delta^2 T\)
                0     8,290
   20          20     8,253      -37         -1
   20          40     8,215      -38         -1
   20          60     8,176      -39         -1
   20          80     8,136      -40          0
   20         100     8,096      -40

We note that \(\Delta\theta\) is constant (= 20) and that \(\Delta^2 T\) is nearly
constant, hence by the preceding theorem the data follow the law:

\[ T = a\theta^2 + b\theta + c \qquad (12) \]

We shall use the method of least squares, and in order to shorten
the work we shall use the substitutions:

\[ X = \frac{\theta - 50}{10} \quad \text{and} \quad Y = T - 8200 \]
or

\[ \theta = 10X + 50 \quad \text{and} \quad T = Y + 8200 \]

Our equation (12) then becomes:

\[ Y = AX^2 + BX + C \]

where

\[ A = 100a \]
\[ B = 1000a + 10b = 10A + 10b \qquad (13) \]
\[ C = 2500a + 50b + c - 8200 = 5B - 25A + c - 8200 \]

To form our normal equations we prepare the following table:


Table 81

\(\theta\)      T       X      Y     \(X^2\)   \(X^3\)   \(X^4\)     XY      \(X^2 Y\)   Computed T
   0     8,290     -5     90      25     -125      625     -450     2,250      8,290.1
  20     8,253     -3     53       9      -27       81     -159       477      8,252.9
  40     8,215     -1     15       1       -1        1      -15        15      8,214.9
  60     8,176      1    -24       1        1        1      -24       -24      8,176.0
  80     8,136      3    -64       9       27       81     -192      -576      8,136.3
 100     8,096      5   -104      25      125      625     -520    -2,600      8,095.8
 --------------------------------------------------------------------------------------
 Total              0    -34      70        0    1,414   -1,360      -458



Substituting the proper sums in equations (10) we have the following
normal equations:

\[ 6C + 70A = -34 \]
\[ 70B = -1360 \]
\[ 70C + 1414A = -458 \]

Solving, we obtain

A = -0.10267857    B = -19.428571    C = -4.46875

from which it follows, using (13), that:

a = -0.0010268    b = -1.8402    c = 8290.11

Hence our equation is:

\[ T = -0.0010268\theta^2 - 1.8402\theta + 8290.11 \]

Assigning the given values to \(\theta\) we obtain the computed values of
T that are found in the last column of Table 81.
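
The whole computation of the Illustrative Example can be sketched as follows. The data are those of Table 80, the coded variables are those of the text, and the small elimination routine is an assumption of this sketch (any linear solver would serve); all function names are illustrative.

```python
# Data of Table 80: temperature theta (deg C) and modulus of torsion T.
thetas = [0, 20, 40, 60, 80, 100]
moduli = [8290, 8253, 8215, 8176, 8136, 8096]


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_parabola(xs, ys):
    """Least-squares Y = A*X^2 + B*X + C from the normal equations (10)."""
    def power_sum(p):
        return sum(x ** p for x in xs)

    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    matrix = [
        [power_sum(2), power_sum(1), len(xs)],
        [power_sum(3), power_sum(2), power_sum(1)],
        [power_sum(4), power_sum(3), power_sum(2)],
    ]
    return solve_linear_system(matrix, [sy, sxy, sx2y])


# Coded variables of the text: X = (theta - 50)/10, Y = T - 8200.
xs = [(theta - 50) / 10 for theta in thetas]
ys = [t - 8200 for t in moduli]
A, B, C = fit_parabola(xs, ys)

# Convert back to the original units using relations (13).
a = A / 100
b = (B - 10 * A) / 10
c = C - 5 * B + 25 * A + 8200
print(round(a, 7), round(b, 4), round(c, 2))
```

Working in the coded variables keeps the sums small and makes the odd power sums vanish, exactly as the text observes.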

85. OTHER USEFUL CURVES 

In this chapter we have attempted to introduce the student to 
some of the methods of fitting simple curves to observed data. We 
have considered in great detail the methods of fitting the straight 
line, the exponential function, the power function, and the quadratic 
polynomial. 

We shall mention now with less detail a few additional well-known 
curves that are frequently found useful. 

A. The Hyperbola \(Y = a + \dfrac{b}{X}\) (14)

This equation represents the hyperbola with the lines Y = a and
X = 0 as asymptotes. It can be written in the form:

\[ Y = a + b\left(\frac{1}{X}\right), \]

which is a straight line with slope b in the (1/X, Y) coordinates.

Hence we can state the test: If \(\Delta Y/\Delta(1/X)\) is constant, the data
obey the law given by (14).

It may be noted that if a = 0, we have as a special case the
well-known:

\[ XY = b \]

B. The Hyperbola \(Y = \dfrac{X}{a + bX}\) (15)

This equation represents the hyperbola with the lines a + bX = 0
and bY = 1 as asymptotes. We can write the equation in the form

\[ \frac{X}{Y} = a + bX \qquad (16) \]

which is that of a straight line with slope b in the (X, X/Y)
coordinates. Hence we can state the test: If \(\Delta(X/Y)/\Delta X\) is constant,
the data may be represented by (15).

The methods for determining the constants for (14) and (15)
should be evident.


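C. The Modified Exponential Function \(Y = a + bc^X\). The following
theorem may be used to determine if the modified exponential
law is applicable.

Theorem: If the values of X are in arithmetic progression and the
values of \(\Delta Y\) are in geometric progression, the data follow the law:

\[ Y = a + bc^X \qquad (17) \]

Table 82

X                                Y                                                                    \(\Delta Y\)
\(X_1\)                          \(Y_1\)                                                              \(\Delta Y_1\)
\(X_2 = X_1 + \Delta X\)         \(Y_2 = Y_1 + \Delta Y_1\)                                           \(\Delta Y_2 = r\Delta Y_1\)
\(X_3 = X_1 + 2\Delta X\)        \(Y_3 = Y_1 + \Delta Y_1 + r\Delta Y_1\)                             . . .
. . .                            . . .                                                                \(\Delta Y_{n-1} = r^{n-2}\Delta Y_1\)
\(X_n = X_1 + (n-1)\Delta X\)    \(Y_n = Y_1 + \Delta Y_1 + r\Delta Y_1 + \cdots + r^{n-2}\Delta Y_1\)

From the hypothesis we have, since the values of \(\Delta Y\) are in
geometric progression:

\[ Y_n = Y_1 + \Delta Y_1 + r\Delta Y_1 + r^2\Delta Y_1 + \cdots + r^{n-2}\Delta Y_1 \]

Using the formula for the sum of a geometric progression, we have:

\[ Y_n = Y_1 + \Delta Y_1\left[\frac{1 - r^{n-1}}{1 - r}\right] = Y_1 + \frac{\Delta Y_1}{1 - r} - \frac{\Delta Y_1}{1 - r}\, r^{n-1} \qquad (18) \]

Further, since the values of X are in arithmetic progression, we
have:

\[ X_n = X_1 + (n - 1)\Delta X \]

\[ n - 1 = \frac{X_n - X_1}{\Delta X} \]

Substituting this value of (n - 1) in (18) we have:

\[ Y_n = Y_1 + \frac{\Delta Y_1}{1 - r} - \frac{\Delta Y_1}{1 - r}\, r^{(X_n - X_1)/\Delta X} \]

or

\[ Y_n = a + bc^{X_n} \]

where

\[ a = Y_1 + \frac{\Delta Y_1}{1 - r}, \qquad b = -\frac{\Delta Y_1}{1 - r}\, r^{-X_1/\Delta X}, \qquad c = r^{1/\Delta X} \]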


To determine the constants of this equation we shall employ the 
method of selected points. We draw the best-fitting curve among 
the points. We now choose three points on the curve whose co- 
ordinates are known — or can be estimated — and whose abscissas 
are in arithmetic progression. We can form three equations by 
substituting the coordinates of the selected points in (17), and solve 
for the unknowns. 

Exercise. A curve of the type \(Y = a + bc^X\) passes through the three
points (1, 10), (3, 28), and (5, 100). What is its equation?

For an illustrative example, consider the data of Table 83 on 
page 336. 



Table 83

We note that the values of X are in arithmetic progression, that
the values of \(\Delta Y\) are approximately in geometric progression, and
conclude that our data follow the law:

\[ Y = a + bc^X \]

To determine the constants, assume that the curve passes through
the points (4, 20.1), (8, 47.6), and (12, 66.8). We then have the
equations:

\[ a + bc^4 = 20.1 \]
\[ a + bc^8 = 47.6 \]
\[ a + bc^{12} = 66.8 \]

Then

\[ bc^4(c^4 - 1) = 47.6 - 20.1 = 27.5 \]
\[ bc^8(c^4 - 1) = 66.8 - 47.6 = 19.2 \]

and by division we obtain:

\[ c^4 = 0.6982 \]
\[ c = 0.9141 \]

By substitution we have:

\[ b(0.6982)(0.6982 - 1) = 27.5 \]

and

\[ b = -130.5 \]

Now a is easily found, for:

\[ a + (-130.5)(0.6982) = 20.1 \]

and

\[ a = 111.2 \]

Hence our equation is:

\[ Y = 111.2 - 130.5(0.9141)^X \]

Other selections of the points will give slightly different values. 
The computed values and the residuals may now be found. 
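
The method of selected points for the modified exponential reduces to three short steps, sketched below under the assumption that the three abscissas are equally spaced. The function name is illustrative; the three points are those chosen in the text.

```python
def fit_modified_exponential(p1, p2, p3):
    """Fit Y = a + b*c**X through three points with equally spaced abscissas."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    step = x2 - x1
    if step == 0 or abs((x3 - x2) - step) > 1e-12:
        raise ValueError("abscissas must be in arithmetic progression")
    # Dividing the two difference equations gives c**step.
    ratio = (y3 - y2) / (y2 - y1)
    if ratio <= 0:
        raise ValueError("successive differences of Y must have the same sign")
    c = ratio ** (1 / step)
    # From  b * c**x1 * (c**step - 1) = y2 - y1.
    b = (y2 - y1) / (c ** x1 * (ratio - 1))
    a = y1 - b * c ** x1
    return a, b, c


# Selected points used in the text.
a, b, c = fit_modified_exponential((4, 20.1), (8, 47.6), (12, 66.8))
print(round(a, 1), round(b, 1), round(c, 4))
```

Because the curve is forced through the three chosen points, the constants change with the selection, as the text remarks.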

D. The Modified Power Function \(Y = c + aX^b\). A test for the
applicability of this law is contained in the

Theorem: If the values of X form a geometric progression and
the values of \(\Delta Y\) also form a geometric progression, then the data obey
the law:

\[ Y = c + aX^b \qquad (19) \]

Table 84

X                         Y                                                                    \(\Delta Y\)
\(X_1\)                   \(Y_1\)                                                              \(\Delta Y_1\)
\(X_2 = rX_1\)            \(Y_2 = Y_1 + \Delta Y_1\)                                           \(\Delta Y_2 = R\Delta Y_1\)
\(X_3 = r^2 X_1\)         \(Y_3 = Y_1 + \Delta Y_1 + R\Delta Y_1\)                             . . .
. . .                     . . .                                                                \(\Delta Y_{n-1} = R^{n-2}\Delta Y_1\)
\(X_n = r^{n-1} X_1\)     \(Y_n = Y_1 + \Delta Y_1 + R\Delta Y_1 + \cdots + R^{n-2}\Delta Y_1\)

From the hypothesis we have:

\[ X_n = r^{n-1} X_1 \quad \text{or} \quad n - 1 = \frac{\log X_n - \log X_1}{\log r} \]

and

\[ Y_n = Y_1 + \Delta Y_1 \left[ 1 + R + R^2 + \cdots + R^{n-2} \right] \]

or

\[ Y_n = Y_1 + \Delta Y_1 \left[ \frac{1 - R^{n-1}}{1 - R} \right] \]

The remainder of the proof easily follows, and we leave its
completion to the reader.

As in the modified exponential, we determine the constants by
the method of selected points but in this case the abscissas should
be chosen in geometric progression.

86. LIMITATIONS OF EMPIRICAL EQUATIONS 

In the preceding pages of this chapter we have been concerned 
with two fundamental questions that relate to empirical equations: 
first, what type of equation should be selected to describe the data, 
and, having decided upon the type of equation, the second question 
is, how can the constants be determined? Having answered the first 
question, the second presents no great difficulty. 

Once the equation for the data has been determined, we have an 
expression that may be used, within certain limits, to estimate values 
of the dependent variable and thus to compare values on the curve 
with observed values. Further, if a criterion of goodness of fit is 
desired, we may turn to the sum of the squares of the residuals. 

To assist in determining the type of equation to be selected we 
have devised tests to apply to the observations. The illustrative 
examples that we have solved have enjoyed a singular peculiarity; 
they have presented data for which the tests were closely satisfied. 
In general, the data have come from the laboratories of the physical 
sciences where it is possible to restrict the problem to a study of the 
variables in question, and to control or eliminate outside influences. 
There have been internal as well as mathematical reasons for se- 
lecting an equation of given type and thus our empirical equations 
have been “true relations” between the variables in question. 

When a physicist is analyzing a set of distance-time data of the
flight of a projectile, he will know for internal reasons that his curve
is a second degree parabola $D = AT^2 + BT + C$. Similarly, a
chemist in analyzing pressure-volume data would likely choose
$P = AV^B$. As a result of slow and painful research, the scientist
learns how certain phenomena behave. It frequently occurs that a 
study of empirical data leads to a formulation and discovery of
relationships that the investigator had not been able to formulate from
analytical considerations. A classic example of this method was the 
discovery and formulation of Kepler’s Laws which explain the mo- 
tions of the planets. These laws were formulated by Johann Kepler 
(1571-1630) after a study of a tremendous quantity of observed data 
collected over a number of years by the brilliant astronomer, Tycho 
Brahe (1546-1601). The truths hidden in the data were not revealed 
to the observer, Brahe, but when Kepler analyzed the data he saw in 
them relationships that he formulated into what are known as 
Kepler’s Laws. Science is replete with similar examples. 

When one moves outside the realm of physical science, he has 
difficulty in finding an equation that explains and expresses a “true 
relationship.” Internal evidence is lacking. Too many uncontrol- 
lable influences are present that cannot be eliminated, and thus our 
data may not lead to an analytical formulation of an inherent 
relationship. In biological, educational, economic, and social rela- 
tionships our knowledge is too limited to enable us to say why a 
relationship exists. The best we can do in these fields is to find 
a functional relationship between the variables in question for the 
particular data at hand. Generally, we cannot explain the why of 
the relationship. In such cases the data obviously may not reveal 
that a certain type of equation is indicated. Sometimes experience
comes to the assistance of the investigator; otherwise he does what
all of us do, namely, the best he can.

Usually the purpose of this functional relationship is to estimate 
sufficiently well the values of one variable from known values of 
another, and frequently this purpose can be accomplished by using 
more than one type of equation. In fact, we can establish the functional
relationship without an equation at all. If to each value of X there
is determined one or more values of Y, then Y is a function of X.
We may determine the values of Y from a graph or a table of values,
and that is all that is really necessary. However, much is gained
if we can obtain a summarizing expression in the form of an
equation.

We then face the practical problem of finding a functional relation- 
ship. If we choose to find an equation, the curve may fit poorly or 
closely. When the data are such that a careful analysis is warranted, 
they should be subjected to a careful analysis; however, should they
not warrant a careful analysis, it is the height of absurdity to subject
them to such a treatment. The investigator must determine the 
type of treatment the data merit. 

In our previous sections we have discussed methods of dealing 
with precise measurements in a precise manner. In fact, we have 
frequently used eight-place logarithms in our computations in order 
that our results might be the more precise. In the next section we 
discuss methods of dealing with data that may not merit a careful 
analysis. 

87. GRAPHICAL METHODS IN TREND ANALYSIS 

Frequently workers in practical statistics are confronted with data 
that do not warrant a careful algebraical and numerical analysis. 
Rough approximations may be sufficiently accurate for the investi- 
gator’s needs. In such cases he usually resorts to the use of graphical 
methods. Especially are graphs widely employed in trend analysis. 
Not only may the graph be used to give a clue to the equation
of the curve that may be used to represent the trend; it may even 
be used to determine the unknown constants that appear in the 
equation that is selected. 

We are familiar with graphs made on the conventional cross- 
section coordinate paper. On this paper a given distance in any 
direction, when applied to a given problem, always represents a 
constant quantity. Such paper may be specifically called “arithmetic
paper,” and the uniform scale an “arithmetic scale.”

We may, however, develop scales on which equal distances do 
not always represent equal magnitudes. A very common and widely 
used scale of this kind is the “logarithmic scale” on which equal
distances represent equal proportional or percentage changes. In
this scale the points correspond to the logarithms of numbers. By
placing the natural numbers, N, and their logarithms, log N, into
correspondence, it is noted that the logarithms are spaced uniformly
along the line while the integers are spaced non-uniformly.

[Figure: the logarithmic scale AB, graduated 1, 2, 3, ..., 10]

The scale from 1 to 10 as shown on the line AB constitutes a
cycle. Any number on the scale, say X, corresponds to log X. That
is, the logarithmic scale serves the purpose of finding the logarithms.
By prolonging the line AB and repeating the scale, we may
construct a segment of two cycles.

It is customary to assign a value to the initial point A. It may
be any number greater than zero. The value to be assigned is
determined by the problem at hand. The value placed at the end
of the cycle, B, is 10 times the value assigned to the point A. Thus,
the numbers along the following scale, AB, serve as illustrations.

 A                                        B
 1     2     3     4    5   6  ...   9   10
 2     4     6     8                     20
 5    10    15    20                     50
13    26    39    52                    130
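
The construction of the logarithmic scale can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name log_scale_position and its parameters start and cycle_length are illustrative assumptions. It places a number on a scale whose initial point A carries the value start, taking one cycle as the unit of length.

```python
import math

def log_scale_position(value, start=1.0, cycle_length=1.0):
    """Distance from the initial point A (labelled `start`) to `value`,
    in units where one full cycle has length `cycle_length`."""
    if value <= 0 or start <= 0:
        raise ValueError("a logarithmic scale carries only numbers greater than zero")
    return cycle_length * math.log10(value / start)

# Positions of 1, 2, ..., 10 on a one-cycle scale that begins at 1.
positions = [log_scale_position(n) for n in range(1, 11)]
for n, p in zip(range(1, 11), positions):
    print(f"{n:2d} -> {p:.4f}")

# Equal ratios occupy equal distances: 2 to 4 spans as much as 4 to 8.
step_2_to_4 = log_scale_position(4) - log_scale_position(2)
step_4_to_8 = log_scale_position(8) - log_scale_position(4)

# A scale that begins at 13 ends its first cycle at 130.
end_of_cycle = log_scale_position(130, start=13)
```

The printed positions are simply the common logarithms of the integers, which is why the integers crowd together toward the upper end of each cycle.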


A. Arithmetic Paper. As an illustration of the use of the graph- 
ical method in determining the straight-line trend we shall consider 
the data that were given in Exercise 3, page 315. 


Table 85. Annual Earnings of the Associated Gas
and Electric System, 1920-1928

Year    Net Earnings              Year    Net Earnings
        (millions of dollars)             (millions of dollars)
1920    13.4                      1925    29.5
1921    16.2                      1926    33.5
1922    19.2                      1927    37.8
1923    22.7                      1928    40.6
1924    25.1                      1929



We plot the data carefully on arithmetic coordinate paper with
X = 0 at 1920 [Figure 44]. The observed points are indicated by
the small crosses. We then sketch in “by sight” the line of trend.
It cuts the Y-axis at 12.5. By using this point and the point (8,
40.5) as two known points on the line, we obtain

$m = \dfrac{40.5 - 12.5}{8 - 0} = 3.5$

Hence we have the equation of the line Y = 3.5X + 12.5.





Figure 44 



In general we proceed as follows: We plot the data carefully on 
arithmetic paper. Next, we draw in by sight the trend line. Then 
selecting two widely separated points A and B on the line, we 
evaluate the ratio of the difference in the ordinates to the difference 
of the abscissas of the two points. This gives us the slope, m, of the 
trend line. Using this slope with some point on the line whose
coordinates are read from the graph, we can find from the point-slope
form

$Y - Y_1 = m(X - X_1)$

the equation of the trend. Of course if the Y-intercept can be
determined from the graph, we may use the slope-intercept form

Y = mX + b 

and thus find the equation of the trend line. 

Obviously this same method may be employed for parabolic, 
exponential, or other types of trend. We choose points equal in 
number to the number of constants in the equation, substitute the 
coordinates in the chosen equation, and solve for the unknowns. 
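
Once two points have been read from the sighted trend line, the rest of the procedure is arithmetic. The following Python sketch is an illustration only; the function name line_through is our own, and the two points (0, 12.5) and (8, 40.5) are the ones read from the graph in the illustration of Table 85.

```python
def line_through(p1, p2):
    """Slope m and Y-intercept b of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points must have different abscissas")
    m = (y2 - y1) / (x2 - x1)      # difference of ordinates over difference of abscissas
    b = y1 - m * x1                # from the point-slope form, solved at X = 0
    return m, b

# Net earnings of Table 85, with X = 0 at 1920 and X = 8 at 1928.
earnings = [13.4, 16.2, 19.2, 22.7, 25.1, 29.5, 33.5, 37.8, 40.6]

m, b = line_through((0, 12.5), (8, 40.5))
trend = [m * x + b for x in range(len(earnings))]
residuals = [observed - computed for observed, computed in zip(earnings, trend)]

for year, observed, computed, rho in zip(range(1920, 1929), earnings, trend, residuals):
    print(year, observed, round(computed, 1), round(rho, 1))
```

Listing the residuals beside the observed values is a quick check on whether the line drawn by sight passes fairly through the points.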

B. Semi-logarithmic Paper. Logarithmic scales may be used 
on the axes of coordinate paper. If the scale on one of the axes is
logarithmic and on the other is arithmetic, the paper is called
semi-logarithmic paper. This type of paper, usually with three cycles,
can be purchased at stationery stores. It is used by statisticians in
studying the growth of populations, bank clearings; in short, in
studying data that may follow the exponential function $Y = ab^X$.

The following theorems contain the gist of the theory. 

Theorem 1. The graph of the exponential function $Y = ab^X$
plotted on semi-logarithmic paper is a straight line whose slope is
log b and whose intercept on the non-uniform scale is a.

Proof: From

$Y = ab^X$

we have, taking logarithms,

$\log Y = (\log b)X + \log a$

which is an equation of the first degree in the (X, log Y) coordinates,
and consequently represents a straight line. The slope is log b and
the vertical intercept is log a, which is read as a on the logarithmic
Y-axis. That is, if the points (X, Y) plotted on uniform coordinate
paper fall upon the curve $Y = ab^X$, when plotted on semi-logarithmic
paper they fall upon the straight line $\log Y = (\log b)X + \log a$.

Conversely, if the points (X, Y) when plotted on semi-logarithmic
paper lie on a straight line with slope log b and with intercept a on the
non-uniformly scaled vertical axis, the data follow the exponential
law $Y = ab^X$.

We shall leave the proof as an exercise for the reader. 

Example 1. Draw the graph of the curve $Y = 2^X$ on arithmetic paper
and on semi-logarithmic paper.

We prepare a table of values and plot the points as indicated. 


Figure 45(a)

$Y = 2^X$

X    Y
0    1
1    2
2    4
3    8

Figure 45(b)


It is noted that the equation plots into a curve on the arithmetic paper 
and into a straight line on the semi-logarithmic paper. The equation of 
the straight line in the (X, log Y) coordinates may be written 

log Y = (log 2)X + log 1 

in which the slope is log 2 and the vertical intercept is 1. 
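
The straightening effect noted in Example 1 can be verified without graph paper. In the minimal Python sketch below (the variable names are our own, chosen for illustration), equal steps in X are seen to produce equal steps in log Y, which is exactly what a straight line in the (X, log Y) coordinates requires.

```python
import math

xs = [0, 1, 2, 3]
ys = [2 ** x for x in xs]                 # the table of values for Y = 2^X
log_ys = [math.log10(y) for y in ys]      # ordinates as laid off on semi-logarithmic paper

# Successive differences of Y grow, so the arithmetic plot is curved.
arithmetic_steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

# Successive differences of log Y are constant, so the semi-log plot is straight.
log_steps = [log_ys[i + 1] - log_ys[i] for i in range(len(ys) - 1)]

slope = log_steps[0]                      # equals log 2
intercept_value = 10 ** log_ys[0]         # the value a read on the logarithmic scale at X = 0

print(arithmetic_steps)
print([round(step, 4) for step in log_steps])
```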


Figure 46 



EXERCISES 

1. Find the equations of the straight lines in Fig. 46 in semi-logarithmic 
form. What are the corresponding equations in exponential form? 

2. Plot the following equations on semi-logarithmic paper.

(a) $Y = 2(3)^X$            (c) $\log Y = 0.5X + \log 3$

(b) $Y = 2(10)^{2X}$        (d) $Y = 3(10)^{-2X}$

3. If $10 is invested at 5 per cent compounded annually the amount Y
at the end of X years is given by Y = 10(1.05)^X. Plot this curve on
semi-logarithmic paper.

Let us next employ semi-logarithmic paper to determine graphi- 
cally the approximate exponential equation that obtains for a mass 
of empirical data. We illustrate the procedure in Example 2. 

Example 2. Find graphically the exponential trend of the gross earnings 
in millions of dollars of all Bell telephone companies in the United States 
as given in the accompanying table. 



Figure 47

Table 86

Year    Earnings
1921      521
1922      564
1923      623
1924      678
1925      761
1926      845
1927      917
1928     1003



We choose X = 0 at 1921. We note that the data, when plotted on
semi-logarithmic paper, lie along the line BC which we draw by sight.
For this line a = 520. Taking the point C(7, 1000) as a second point on
the line we have

$\text{slope} = \log b = \dfrac{\log 1000 - \log 520}{7 - 0} = \dfrac{3.0000 - 2.7160}{7} = 0.0406$

$b = 1.1$ approximately

The equation of the straight line in semi-logarithmic coordinates is
therefore

$\log Y = 0.0406X + \log 520$

and the corresponding exponential equation is

$Y = 520(1.1)^X$
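
The computation of Example 2 can be repeated with a few lines of Python. This is a sketch under the same assumptions as the example, namely that the sighted line passes through (0, 520) and (7, 1000); the function name exponential_through is hypothetical and serves only for illustration.

```python
import math

def exponential_through(p1, p2):
    """Constants a, b of Y = a * b**X whose semi-log line passes through two points.
    Also returns the slope log b of that line (common logarithms)."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (math.log10(y2) - math.log10(y1)) / (x2 - x1)
    b = 10 ** slope
    a = y1 / b ** x1
    return a, b, slope

# Gross earnings of Table 86, with X = 0 at 1921.
earnings = [521, 564, 623, 678, 761, 845, 917, 1003]

a, b, slope = exponential_through((0, 520), (7, 1000))
computed = [a * b ** x for x in range(len(earnings))]

print(round(slope, 4), round(b, 3))
for year, observed, estimate in zip(range(1921, 1929), earnings, computed):
    print(year, observed, round(estimate))
```

Carrying b to more places than the rounded value 1.1 keeps the computed curve passing through the second selected point; with b = 1.1 exactly the estimate for 1928 runs somewhat high.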


EXERCISES 

1. The registration (in millions) of motor vehicles in the United States 
in the given years is shown by the following table. [ Statistical Abstract 
of the U.S., 1930, p. 385.] Using semi-logarithmic paper, find an exponential 
function that will approximately fit the data. 


Year    Registration        Year    Registration
1917     5.0                1922    12.2
1918     6.1                1923    15.1
1919     7.6                1924    17.6
1920     9.2                1925    19.9
1921    10.5                1926    22.0


2. Use semi-logarithmic paper to fit an exponential curve to the fol- 
lowing data which give the average number of shares (in millions) sold 
on the New York Stock Exchange from 1919 to 1929 inclusive. 


Year    Sales         Year    Sales
1919    26.07         1925    37.69
1920    18.73         1926    37.42
1921    14.30         1927    48.08
1922    21.73         1928    76.71
1923    19.77         1929    93.75
1924    23.50




3. Use semi-logarithmic paper to fit an exponential curve to the follow- 
ing data which give the production (in millions of barrels) of petroleum in 
the United States 1920-1929. [Statistical Abstract of the United States, 
1936, p. 723.] 


Year    Production        Year    Production
1920    443.0             1925     763.7
1921    472.2             1926     770.9
1922    557.5             1927     901.1
1923    732.4             1928     901.5
1924    713.9             1929    1007.3


C. Logarithmic Paper. Thus far we have used two types of 
coordinate paper in our work, arithmetic and semi-logarithmic. In 
the arithmetic paper, the scale along both axes is the natural scale. 
The semi-logarithmic paper has the natural scale along the axis of 
abscissas and a logarithmic scale along the axis of ordinates. 

Another useful type of paper is logarithmic paper. This paper is 
ruled with logarithmic scales both horizontally and vertically. It 
is frequently called double logarithmic or log-log paper. When a
point (X, Y) is plotted on log-log paper, its actual distances from
the reference lines are proportional to log X and log Y. In other
words, in graphing pairs of numbers on logarithmic paper we really
graph the logarithms of the numbers. The logarithmic paper serves
the purpose of finding the logarithms of the numbers. The effect
of this is to tone down the contrasts. For example,

log 1000 is only 3, and log 0.0001 is -4.


Double logarithmic paper is very useful in studying the power 
function 


$Y = aX^b$

where a and b are constants. This is due to the fact that the graph
of the power function on logarithmic paper is a straight line. For
we have the

Theorem: The graph of the power function

$Y = aX^b$

plotted on logarithmic paper is the straight line whose slope is b
and whose intercept on the Y-axis is a.

Proof: Taking logarithms of the above equation, we have

$\log Y = b \log X + \log a$

which is an equation, in logarithmic coordinates, of a straight line
with slope b and Y-intercept a. The intercept is read where X = 1,
since log 1 = 0.

Conversely, if the (X, Y) data when plotted on logarithmic paper
give a straight line with slope b and Y-intercept a, the data follow
the law

$Y = aX^b$

Proof: The (log X, log Y) relation is linear. Hence

$\log Y = b \log X + \log a$

which can be immediately reduced to

$Y = aX^b$





Example 1. Draw the graph of $Y = 3X^2$ on logarithmic paper.

Since the graph is a straight line, we need but two points, say (1, 3) and
(2, 12), to determine the constants. These two points determine the line
AB of Figure 48.

The slope b is given by

$b = \dfrac{\log 12 - \log 3}{\log 2 - \log 1} = \dfrac{\log 4}{\log 2} = \dfrac{2 \log 2}{\log 2} = 2$

From the figure a = 3, hence the log-log equation is

$\log Y = 2 \log X + \log 3$


Figure 49 



Example 2. Find the equation of the curve that graphs into the line 
marked c of Figure 49. 

We have

$\text{slope} = b = \dfrac{\log 8 - \log 4}{\log 10 - \log 1} = \dfrac{\log 2}{1} = \log 2$

From the figure a = 4, and the log-log equation is

$\log Y = (\log 2) \log X + \log 4$

Using the properties of logarithms we obtain

$Y = 4X^{\log 2} = 4X^{0.3010}$


Example 3. Using log-log paper find the equation that approximately 
fits the data: 


X     1    3    5    7    10    20    40    60    100
Y    25   45   60   70    90   130   190   240    300


Figure 50 



In solving this problem we find it necessary to use two-cycle log-log
paper. We indicate the points by small crosses. Since the points lie
approximately upon the straight line AB, the data may be approximately
represented by $Y = aX^b$. Assume that the line passes through the points
A(1, 25) and B(100, 300).

$b = \text{slope} = \dfrac{\log 300 - \log 25}{\log 100 - \log 1} = \dfrac{\log 12}{2} = \dfrac{1.0792}{2} = 0.54$

From Figure 50 the Y-intercept = a = 25. Hence the log-log equation is

$\log Y = 0.54 \log X + \log 25$

from which we immediately obtain

$Y = 25X^{0.54}$
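
The arithmetic of Example 3 may be checked as follows. This Python sketch is illustrative only: the function name power_through is our own, and it assumes, as the example does, that the sighted line passes through A(1, 25) and B(100, 300).

```python
import math

def power_through(p1, p2):
    """Constants a, b of Y = a * X**b whose log-log line passes through two points."""
    (x1, y1), (x2, y2) = p1, p2
    b = (math.log10(y2) - math.log10(y1)) / (math.log10(x2) - math.log10(x1))
    a = y1 / x1 ** b               # the value of Y where X = 1
    return a, b

# The data of Example 3.
X = [1, 3, 5, 7, 10, 20, 40, 60, 100]
Y = [25, 45, 60, 70, 90, 130, 190, 240, 300]

a, b = power_through((1, 25), (100, 300))
computed = [a * x ** b for x in X]

print(a, round(b, 2))
for x, observed, estimate in zip(X, Y, computed):
    print(x, observed, round(estimate, 1))
```

Comparing the computed column with the observed one shows how closely a line drawn by sight through only two points can follow the remaining observations.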





EXERCISES 

1. Find the log X, log Y and the X, Y equations of the lines a, b, d, e, f,
and g of Figure 49.

2. Use log-log paper to determine the log X, log Y and the X, Y equations
for the data given in the table:

X      4      8      12      16       20       24
Y    2.9   23.0    77.8   184.0    360.0    622.1


3. Use log-log paper to determine the approximate X, Y relation for 
the data given in the table: 


X    10    20    30    40     50     60
Y    11    31    57    88    122    161


4. Plot the following data on two-cycle log-log paper and determine
the log X, log Y and the X, Y relations.

X    5    7    9    15    20    30    40     50
Y    1    2    3     9    16    37    65    100


5. Solve Exercise 4 above using the method of averages for the (log X, 
log Y) straight line. 

6. Use semi-log paper to find the X, log Y and the X, Y equations for
the data:

X    1.6    3.1    4.7     6.3     7.9     9.4    11.0
Y    5.4    7.2    9.6    12.8    17.1    22.9    30.8


7. The number N of bacteria in a given culture t hours after they were
first observed was found to be that given by the table. Using semi-log
paper find N in terms of t.

t      0      1      2      3      4       5       6
N    125    209    340    561    924    1525    2512

8. The number N of bacteria in a culture at the end of t hours is shown
by the following table. Use semi-log paper to find N in terms of t.

t      0      1      2      3      4       5       6
N    100    162    265    450    742    1230    2020



9. The annual expenditure of the United States Government (in millions
of dollars) has increased as in the table. Use semi-log paper to determine
the appropriate law. Would you advise using this law to extrapolate for
the expenditure in 1918?

Year    Expenditure        Year    Expenditure
1840     24                1880    265
1850     41                1890    298
1860     63                1900    488
1870    294                1910    660


10. The total assets (in billions of dollars) of Building and Loan
Associations in the United States for the given years are shown in the
following table. Use semi-log paper to find the X, log Y and the X, Y
equations.

Year    X    Assets Y        Year    X    Assets Y
1920    0    2.52            1925    5    5.51
1921    1    2.89            1926    6    6.33
1922    2    3.34            1927    7    7.18
1923    3    3.94            1928    8    8.02
1924    4    4.77            1929    9    8.70


11. The following table gives the average monthly imports of wood
pulp (millions of short tons) into the United States for the given years.
Choose X = 0 at 1926 and find the straight-line equation by selected
points. Extrapolate for the years 1931, 1932, and 1933. The actual
imports these years were 133.0, 123.5, and 161.8 short tons.

Year    Imports        Year    Imports
1922    105            1927    140
1923    115            1928    147
1924    127            1929    157
1925    139            1930    152
1926    145




12. The following table gives the production of women’s shoes (in
millions of pairs) for the given years. Plot the data on semi-logarithmic
paper and determine the X, log Y and the X, Y relations using X = 0
at 1931. Find the extrapolated value for 1940. The actual value was
12.5 million pairs.

Year    X    Production        Year    X    Production
1931    0     9.4              1936    5    13.5
1932    1     9.5              1937    6    12.5
1933    2    10.9              1938    7    12.3
1934    3    11.1              1939    8    14.0
1935    4    12.1              1940    9



13. The following table gives (in millions of pounds) the domestic
consumption of rayon in the United States from 1920 to 1936. Plot on
semi-logarithmic paper with X = 0 at 1920, and find graphically the X,
log Y and the X, Y equations. Find the extrapolated value for 1937.
The actual value was 261.2 million pounds.

Year    X    Consumption        Year     X    Consumption
1920    0      9                1929     9    131
1921    1     20                1930    10    118
1922    2     25                1931    11    157
1923    3     33                1932    12    152
1924    4     42                1933    13    212
1925    5     58                1934    14    195
1926    6     61                1935    15    253
1927    7    100                1936    16    298
1928    8    100                1937    17



14. Plot the data of Exercise 13 above on arithmetic paper and use
the method of selected points to find the equation of the parabola
$Y = AX^2 + BX + C$ that will approximately fit the data. Choose X = 0
at 1920. Extrapolate for 1937.

15. The following table gives the annual production of cigarettes
(billions) in the United States in the given years. Use semi-logarithmic
paper to find the X, log Y and the X, Y equations. Choose X = 0 at 1920,
and let Y = Production. Find the extrapolated value for 1930. The actual
value for 1930 was 123.8 billions.

Year    Annual Production        Year    Annual Production
        (billions)                       (billions)
1920    47.4                     1925     82.2
1921    52.1                     1926     92.1
1922    55.8                     1927     99.8
1923    66.7                     1928    108.7
1924    72.7                     1929    122.3





16. Find the trend line for the changing price of beef as described in the 
data of Table 12 (p. 47). 

17. In the following table the unit is 1,000,000 barrels of 42 gallons. 


Annual Production of Petroleum in the
United States, 1900-1913

Year    Production        Year    Production        Year    Production
1900     63.6             1905    134.7             1910    209.6
1901     69.4             1906    126.5             1911    220.4
1902     88.8             1907    166.1             1912    222.9
1903    100.5             1908    178.5             1913    248.4
1904    117.1             1909    183.2




Find the equation of the trend line, the computed values of the produc- 
tion for the given years, and the residuals. Find the predicted values for 
the years 1915 and 1920 and compare your results with those given in 
Commerce Yearbook , 1930, page 293, which are as follows: 1915, production, 
281.1; 1920, production, 442.9. What can you say for the trend line for 
purposes of prediction? 

18. Find the equation of the trend line (a) excluding the years 1916, 
1917, and 1918, and (b) including these years. Find the computed produc- 
tion and the residuals in each case. 


Average Monthly Production of Pig Iron in the
United States, 1903-1918 1

Year    Production          Year    Production          Year    Production
        (1000 long tons)            (1000 long tons)            (1000 long tons)
1903    1,452               1909    2,116               1914    1,921
1904    1,344               1910    2,237               1915    2,472
1905    1,882               1911    1,944               1916    3,252
1906    2,066               1912    2,448               1917    3,182
1907    2,109               1913    2,560               1918    3,209
1908    1,302


19. Find the equation of the trend line, the computed values of the 
production, and the residuals. 


1 The data are taken from Review of Economic Statistics , Vol. I, p. 66; United 
States Department of Commerce, Survey of Current Business, No. 42, p. 44. 



Total Production of Crude Steel, 1900-1929 1

(Production in millions of long tons)

Year    Production    Year    Production    Year    Production    Year    Production
1900    10.6          1908    14.0          1916    42.8          1923    44.9
1901    13.5          1909    24.0          1917    45.1          1924    37.9
1902    14.9          1910    26.1          1918    44.5          1925    45.4
1903    13.9          1911    23.7          1919    34.7          1926    48.3
1904    13.9          1912    31.3          1920    42.1          1927    44.9
1905    20.0          1913    31.3          1921    19.8          1928    51.5
1906    23.4          1914    23.5          1922    35.6          1929    56.4
1907    23.4          1915    32.2

23.4 

1915 

32.2 






88. GOODNESS OF FIT OF CURVES TO OBSERVED 
DATA: NONLINEAR CORRELATION 

A. Goodness of Fit. The investigator who takes the time to
derive an empirical formula for a set of observed data is naturally
interested in knowing how well the curve fits the observations. He
therefore will always find the computed values of the dependent
variable by his formula, and usually the Y-residuals if Y is the
dependent variable.

Any Y-residual, it will be recalled, is given by $\rho_i$ where

$\rho_i = \text{the observed } Y_i - \text{the computed } Y_i$

The variation in the residuals may be measured by their mean
deviation or by their standard error. That is by:

$\text{M.D. of } \rho = \dfrac{\sum |\rho_i|}{n}$    or    $S_y = \sqrt{\dfrac{\sum \rho_i^2}{n}}$

If the constants have been found by the method of selected points
or by the method of averages, the mean deviation is adequate, but
if the constants have been determined by the method of least squares,
$S_y$ is the natural measure. In either case the results will be expressed
in the given Y unit.

1 The data are taken from Statistical Abstract of the United States, 1918, p. 251;
ibid., 1930, p. 756.

In Section 63 (p. 238) while evaluating $S_y$ for the line Y = mX + b,
which has been fitted by least squares, we found

$S_y = \sigma_y \sqrt{1 - r^2}$

where $r = \dfrac{\sum xy}{n \sigma_x \sigma_y}$ is the cross-product formula for measuring linear
correlation. We also found r to be an excellent measure of the goodness
of fit of the points to the derived line. If the formula above is
solved for r, we have

$\text{correlation based upon the straight line} = r = \sqrt{1 - \dfrac{S_y^2}{\sigma_y^2}}$    (20)

where $S_y$ is the standard error of estimate based upon the straight
line.
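
The relation between r and the standard error of estimate can be confirmed numerically. The sketch below is not from the text; the function name straight_line_summary is hypothetical, and the earnings of Table 85 are used merely as convenient data. It fits the least-squares line, forms the residuals, and computes both r from the cross-product formula and the quantity given by formula (20).

```python
import math

def straight_line_summary(xs, ys):
    """Least-squares line Y = mX + b with the measures of fit described above."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    x = [v - mean_x for v in xs]              # deviations from the means
    y = [v - mean_y for v in ys]
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_xx = sum(p * p for p in x)
    sum_yy = sum(q * q for q in y)
    m = sum_xy / sum_xx
    b = mean_y - m * mean_x
    residuals = [Y - (m * X + b) for X, Y in zip(xs, ys)]
    sigma_x = math.sqrt(sum_xx / n)
    sigma_y = math.sqrt(sum_yy / n)
    return {
        "m": m,
        "b": b,
        "mean_deviation": sum(abs(p) for p in residuals) / n,
        "s_y": math.sqrt(sum(p * p for p in residuals) / n),
        "sigma_y": sigma_y,
        "r": sum_xy / (n * sigma_x * sigma_y),    # cross-product formula
    }

years = list(range(9))                            # X = 0 at 1920
earnings = [13.4, 16.2, 19.2, 22.7, 25.1, 29.5, 33.5, 37.8, 40.6]

summary = straight_line_summary(years, earnings)
r_from_formula_20 = math.sqrt(1 - summary["s_y"] ** 2 / summary["sigma_y"] ** 2)
print(summary)
print(r_from_formula_20)
```

Formula (20) yields only the magnitude of r; the sign must be taken from the slope of the line.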

B. Nonlinear Correlation. The process of arriving at a coefficient 
of correlation based upon curvilinear regression is comparatively 
simple in principle but often becomes very complex in practice. To 
emphasize the evident simplicity of the process let us proceed exactly 
as we did in Section 63 and find a coefficient of correlation based 
upon the parabola 

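$y = ax^2$

where x and y are deviations of X and Y from their respective means,
$M_x$ and $M_y$. Since any y-residual is given by

$\rho_i = y_i - a x_i^2$

we have

$\sum \rho_i^2 = a^2 \sum x_i^4 - 2a \sum x_i^2 y_i + \sum y_i^2$

which is a quadratic in a. Now $\sum \rho_i^2$ is a minimum when

$a = \dfrac{-(-2 \sum x^2 y)}{2 \sum x^4} = \dfrac{\sum x^2 y}{\sum x^4}$

Hence the best-fitting curve is given by

$y = \dfrac{\sum x^2 y}{\sum x^4} \, x^2$

where a is computed, of course, from the observed values.

For this value of a, the sum of the squares of the residuals becomes:

$\sum \rho^2 = \dfrac{(\sum x^2 y)^2}{\sum x^4} - \dfrac{2 (\sum x^2 y)^2}{\sum x^4} + \sum y^2 = \sum y^2 - \dfrac{(\sum x^2 y)^2}{\sum x^4}$

and $S_y^2 = \dfrac{\sum \rho^2}{n}$ becomes:

$S_y^2 = \sigma_y^2 \left[ 1 - \dfrac{(\sum x^2 y)^2}{\sum x^4 \sum y^2} \right]$

Now evidently a coefficient of correlation based upon the parabola
$y = ax^2$ is the expression:

$\dfrac{\sum x^2 y}{\sqrt{\sum x^4 \sum y^2}}$

Note that in this case

$\text{the coefficient of correlation based on the parabola} = \sqrt{1 - \dfrac{S_y^2}{\sigma_y^2}}$

where $S_y$ is the standard error of estimate for the parabola.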
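
The steps of this derivation can be traced numerically. In the Python sketch below the function name parabola_correlation and the small data set are hypothetical, introduced only for illustration. The sketch computes a from the minimizing condition, then obtains the coefficient in two ways: from the sums directly, and from the standard error of estimate.

```python
import math

def parabola_correlation(X, Y):
    """Coefficient of correlation based upon y = a*x**2,
    where x and y are deviations of X and Y from their means."""
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    x = [v - mean_x for v in X]
    y = [v - mean_y for v in Y]
    sum_x2y = sum(p * p * q for p, q in zip(x, y))
    sum_x4 = sum(p ** 4 for p in x)
    sum_y2 = sum(q * q for q in y)
    a = sum_x2y / sum_x4                               # value of a that minimizes the sum of squares
    s_y_squared = sum((q - a * p * p) ** 2 for p, q in zip(x, y)) / n
    sigma_y_squared = sum_y2 / n
    from_sums = sum_x2y / math.sqrt(sum_x4 * sum_y2)
    from_standard_error = math.sqrt(max(0.0, 1 - s_y_squared / sigma_y_squared))
    return a, from_sums, from_standard_error

# Hypothetical observations, not taken from the text.
X = [1, 2, 3, 4, 5, 6, 7]
Y = [10, 5, 2, 1, 2, 5, 10]

a, from_sums, from_standard_error = parabola_correlation(X, Y)
print(a, from_sums, from_standard_error)
```

The two results agree in magnitude, which is the content of the note above; the expression in sums also carries the sign of a.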

In order to emphasize that this simple method will not always 
work, let the student undertake its application to the curves: 

$y = x^a$ and $y = a^x$

He will soon discover that “the method is simple in principle but very 
complex in practice.” 

However, for any curve which can be fitted to observed data we
can always find $S_y$ by the definition:

$S_y = \sqrt{\dfrac{\sum [Y \text{ observed} - Y \text{ computed}]^2}{\text{the number of observations}}}$

and we can find $\sigma_y$ by the methods of Chapter 4. We can therefore
define as a measure of correlation based upon any such curve the
function 1

$\text{measure of correlation based upon any curve} = \sqrt{1 - \dfrac{S_y^2}{\sigma_y^2}}$

where $S_y$ is the standard error of estimate for the curve, and $\sigma_y$
is the standard deviation of the given Y measures. This measure
of correlation has been called the index of correlation, and is denoted
by:

$\rho_{XY}$

1 For a test of linearity of regression, see Rietz and others, op. cit., p. 131.


The limits of $\rho_{XY}$ are 0 and 1, a value of 0 indicating no relationship
based upon the given function and a value of 1 denoting perfect
relationship. In general:

No positive or negative sign should be attached to $\rho_{XY}$, for the
relationship might be positive over part of the range and negative over
other parts. 1

If the given curve is a straight line, then

$\rho_{XY} = r_{XY}$

It seems hardly necessary to state that if correlation is measured by
$\rho_{XY}$, the curve to which it applies should always be stated. In the
case of r no statement is necessary for it is generally understood
that r is based upon linear regression.
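
Since the index of correlation requires only the observed and the computed values of Y, it is easily evaluated for any fitted curve. The Python sketch below is illustrative; the function name index_of_correlation is our own, and the decision to return 0 when the curve fits worse than the mean of Y is an assumption made here to keep the result within the stated limits. The curve $Y = 25X^{0.54}$ and its data are those of Example 3 of Section 87.

```python
import math

def index_of_correlation(observed, computed):
    """Index of correlation: sqrt(1 - S_y**2 / sigma_y**2) for any fitted curve."""
    n = len(observed)
    mean_y = sum(observed) / n
    sigma_y_squared = sum((y - mean_y) ** 2 for y in observed) / n
    s_y_squared = sum((y - c) ** 2 for y, c in zip(observed, computed)) / n
    # Assumption: a curve that fits worse than the mean is reported as 0.
    return math.sqrt(max(0.0, 1 - s_y_squared / sigma_y_squared))

X = [1, 3, 5, 7, 10, 20, 40, 60, 100]
Y = [25, 45, 60, 70, 90, 130, 190, 240, 300]
computed = [25 * x ** 0.54 for x in X]       # the curve found graphically in Example 3

rho_power_curve = index_of_correlation(Y, computed)
rho_perfect = index_of_correlation(Y, Y)                          # computed values equal to observed
rho_mean_only = index_of_correlation(Y, [sum(Y) / len(Y)] * len(Y))  # every computed value is the mean
print(rho_power_curve, rho_perfect, rho_mean_only)
```

Whenever such an index is reported, the curve on which it is based should be named with it.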


EXERCISES 

1. Fit a straight line to the data of the following table. 

Patients in New York State Hospitals for the
Insane, 1910-1931 2

Year    Number of Patients          Year    Number of Patients
        per 1,000,000 Population            per 1,000,000 Population
1910    35.6                        1922    40.2
1913    36.9                        1925    41.6
1916    38.1                        1928    43.3
1919    38.8                        1931    45.0


2. In a certain gas-pressure experiment the following results, in which
V is the volume corresponding to the pressure p, were obtained. Fit an
appropriate curve to the data.

50.48    59.30    67.08    74.36
 7.55     6.47     5.65     5.07


3. The following table gives the number of divorces per 1,000 marriages
during the given years. Fit a curve of the type $Y = aX^2 + bX + c$ to
the data. (Choose X = 0 at 1910.)

1 F. C. Mills, Statistical Methods , Revised, p. 408. 

2 The data are from World Almanac , 1932, p. 534. 









Divorces in the United States, 1890-1930 1

Year    Number of Divorces        Year    Number of Divorces
        per 1,000 Marriages               per 1,000 Marriages
1890    62                        1915    104
1895    67                        1920    134
1900    81                        1925    148
1905    84                        1930    170
1910    88




4. The following table gives the number of divorces per 1,000 population
during the given years. Fit a curve of the type $Y = ab^X$ to these
data. What are the computed values for the years 1915 and 1928? The
actual values were 1.05 and 1.66.

Divorces in the United States, 1870-1930 2

Year    Number of Divorces         Year    Number of Divorces
        per 1,000 Population               per 1,000 Population
1870    0.28                       1910    0.90
1880    0.39                       1920    1.60
1890    0.53                       1930    1.56
1900    0.73




5. The following table gives the number of grams S of anhydrous ammonium
chloride which, dissolved in 100 grams of water, makes a saturated
solution of θ° absolute temperature. Fit an appropriate curve to the data.

θ     273     283     288     293     313     333     353     373
S    29.4    33.3    35.2    37.2    45.8    55.2    65.6    77.3


6. The velocity of water in feet per second in the Mississippi river
was measured at various depths, and the ratios, D, of the measured
depth to the depth of the river were computed. Fit a curve of the type
$V = aD^2 + bD + c$. Find the computed V when D = 0.9. The observed
value was V = 2.9759.

[Table of D and V values illegible in the source]

1 The data are from Statistical Abstract of the United States, 1932, p. 87.

2 The data are from World Almanac, 1932, p. 444.



7. The following table gives the temperature θ of a vessel of cooling
water at the end of t minutes. Show that the data may be appropriately
fitted to $\theta = c + ab^t$. Find the values of a, b, and c and the computed
values of θ.

t       0       1       2       3       5       7      10      15      20
θ    92.0    85.3    79.5    74.5    67.0    60.5    53.5    45.0    39.5


8. For the data of the following table find the exponential curve which 
appropriately describes the trend. Find the amount in force in 1930 
computed by the trend and compare the result with 107.9, which was the 
actual value. 

Life Insurance in Force in the United States, 1880-1928 1

Year    Total Amount        Year    Total Amount
        (billions)                  (billions)
1880     1.6                1915    22.8
1890     4.0                1920    42.3
1900     8.6                1925    71.7
1905    13.4                1928    95.2
1910    16.4                1930



9. The indicated horse-power, I, required to drive a ship of displacement
D tons at a ten-knot speed is given by the following data. Justify
the use of the curve $I = aD^b$. Fit this curve to the data.

D    1,720    2,300    3,200    4,100
I      655      789    1,000    1,164


10 . For the data of the following table fit a parabola Y — aX z + bX 2 
+ cX + d. (Choose X = 0 at 1920.) Use the derived formula to predict 
the number of failures in 1931, and compare with the actual number, 28.3. 

1 The data are from Statistical Abstract of the United States, 1932, p. 283. 











Commercial Failures in the United States, 1910-1930 1

Year    Number of Failures (thousands)    Year    Number of Failures (thousands)
1910             12.6                     1921             19.7
1911             13.4                     1922             23.7
1912             15.5                     1923             18.7
1913             16.0                     1924             20.6
1914             18.3                     1925             21.2
1915             22.2                     1926             21.8
1916             17.0                     1927             23.1
1917             13.9                     1928             23.8
1918             10.0                     1929             22.9
1919              6.5                     1930             26.4
1920              8.9                     1931



11. Using the method of Section 63 (p. 237), show that a coefficient of
correlation based upon the parabola $Y = a\sqrt{X}$ is

$$r = \frac{\sum Y\sqrt{X}}{\sqrt{\sum X \sum Y^2}}$$

12. Show that a coefficient of correlation based upon the equilateral
hyperbola $xy = a$ is

$$r = \frac{\sum \frac{y}{x}}{\sqrt{\sum \frac{1}{x^2} \sum y^2}}$$

13. Find the correlation coefficient based upon $xy = a$ for the data of
Table 55 (p. 242). Compare your result with the correlation based upon
linear regression.

14 . Fit an appropriate curve to the data of Exercise 18 (p. 106). 

15. What law will satisfactorily represent the following data? Find the
values of the constants for the curve selected.

X      Y          X      Y
2    12.83        8    19.95
3    13.48        9    22.31
4    14.28       10    25.24
5    15.28       11    28.87
6    16.52       12    33.37
7    18.05       13    38.44


1 The data are from Statistical Abstract of the United States, 1932, p. 295. 





16. Show that if $Y = aX^2 + bX + c$ is selected to represent a mass
of observed data, the equations for the determination of the constants by
the method of moments (see Section 69B, p. 219) are those given by (10).

17. The curve $Y = c + aX^b$ passes through the points (2, 11.5), (4,
18.8), and (8, 39.7). Determine a, b, c. Find Y when X = 5.

18. The curve $Y = c + ab^X$ passes through the points (2, 5.3), (4, 12.8),
and (6, 30.2). Determine a, b, c. Find Y when X = 3.

19. Does the point $(M_X, M_Y)$ lie on the parabola $Y = AX^2 + BX + C$
if it is fitted by least squares?



Chapter 11

PERMUTATIONS, COMBINATIONS, AND 
PROBABILITY 

89. INTRODUCTION 

In Section 2 of this text we indicated that the solution of a general 
statistical problem may be divided into four parts: (1) the collection 
of the data, (2) its organization, (3) its analysis, and (4) the inter- 
pretation of the results of the analysis. The earlier chapters have 
been devoted primarily to the steps of organization and analysis. 
Given masses of numerical data, we have learned to present them in 
suitable tabular form, to represent them with graphic devices which 
emphasize some of the significant features, and to effect numerical 
analyses the results of which — when properly interpreted — present 
numerical descriptions of the groups. 

In our previous discussion we have analyzed a large number of 
frequency distributions that were derived from several fields: biology, 
education, sociology, economics, psychology, engineering. Each 
distribution has presented a specific problem and has been analyzed 
as a specific problem. We have thus far made but little attempt at 
generalization. Our method has been the method of science: ob- 
servation, classification, analysis. We now approach the final step, 
generalization. 

In order to extend our method beyond the analysis of a specific 
group of data, we are now about to enter upon a study of problems 
that are rather theoretical. It must not be assumed that because 
the problems are theoretical they are impractical. We shall find 
that they are decidedly practical. The first theoretical problem to 
which we shall give attention will be the development of some 
general laws to describe frequency distributions, the point binomial 
and the normal curve, that are usually spoken of as laws of chance. 
We shall then be in a position to compare theory with observation 
and to determine whether the differences between theory and observation
are such as may be accounted for by causes other than
chance.

The reader has doubtless noted that statistical measurements when 
gathered in fairly large numbers, although possessing considerable 
variation, show a quality of orderliness that is at times amazing. As 
we pass along the scale of measurement of a variable, from the 
smallest magnitude to the largest, we find orderliness in the change in 
the frequency. Most commonly the frequency, relatively small at
the lower end of the range, increases regularly until a maximum is
reached in the central portion of the range, then diminishes regularly
toward zero at the upper end of the range.

This behavior in variation in observed phenomena was first appreciated
by the mathematical astronomer Pierre Simon Laplace (1749-1827)
to the degree that he expressed the behavior by a
mathematical function known as the normal law. The law had been
previously discovered by the mathematician Abraham de Moivre
(1667-1754) in 1733 as an adventure in pure mathematics to explain
the probabilities of games of chance. Carl Friedrich Gauss (1777-1855)
made use of it and thus gave it the approval of a very great
mathematician. The application of this function to biological vari- 
ations was soon appreciated by the Belgian scientist, Adolphe Quetelet 
(1796-1874). The normal law has thus become a foundation stone 
in the modern statistical structure. That it would someday be used 
in the solution of biological, social, and economic problems and be 
invoked in countless investigations of the sciences was of course 
never dreamed or imagined by its discoverer. 

The second theoretical problem, one to which we have alluded 
several times in the text and to which we shall devote further atten- 
tion, is what may be termed the problem of sampling. We have seen 
that we may describe a mass of quantitative data as precisely as 
we please by computing for the data certain statistical constants. 
These constants give a condensed description of the group in terms 
of the group’s characteristics. Among the tremendous gains realized 
by this summary, not the least important is this: the summary 
makes possible the comparison of the characteristics of the individual 
with the characteristics of the group of which he is a part. 

This group that is measured and analyzed is usually a sample,
a small part of a larger universe or parent population that is impossible
or impracticable to measure. Generally, we desire to use the results
of the study of the sample to make estimates of the constants that 
statistically describe the universe. This process is called statistical 
inference or statistical induction. It is the problem of inferring the 
characteristics of the universe from the characteristics of the sample, 
and measuring the reliability of the inferences. This problem may 
be stated as a question: To what extent is M, σ, or any other constant
computed from a sample of N observations randomly made
from a universe trustworthy as the mean, standard deviation, or 
other value of the universe? The answer to this question constitutes 
what we term the interpretation of statistical results. 1 

Statistical induction is literally permeated with questions that re- 
late to the theory of probability, and in order to understand enough 
of this science to appreciate its widespread applications we shall 
now introduce the student to the simplest ideas of the theory. 

In the present chapter we shall consider certain elementary notions 
of probability. These notions we shall approach along the avenue 
of permutations and combinations. We shall undertake to give a 
thorough and much needed drill in a number of important algebraic 
concepts which will find repeated application in the chapters that 
follow. Permutations and combinations will lead us to the point 
binomial, which in turn will serve to introduce us to the normal 
probability curve. Thus we start with the notion of a permutation. 


90. PERMUTATIONS 

A permutation is an order or an arrangement of all or a part of a 
number of things. 

Thus, the permutations of the three letters a, b, c, taken all at a
time are: abc, acb, bac, bca, cab, cba.

It is seen that 3 objects can be arranged linearly in 3 • 2 = 6 different
ways. We might reason in the following manner. There are
3 places to be filled. The first place can be filled in 3 ways, and
with each of these the second place can be filled in 2 ways. Hence
the first 2 places can be filled in 6 ways. With each of these 6 ways of
filling the first 2 places there is 1 way of filling the last place, hence
3 • 2 • 1 = 6 ways in all.

1 So important is this aspect of our study that some writers devote practically
their entire treatments to it. For examples, see the texts by R. A. Fisher and
by Alan E. Treloar which are listed in Appendix A.

This example illustrates the following: 

Fundamental Principle. If one thing can be done in m ways, and
if, after this is done in one of these ways, a second thing can be done
in n ways, then the two together can be done in mn ways.

The foregoing principle may be extended into the 

Theorem. If one thing can be done in $m_1$ ways, a second in $m_2$
ways, a third in $m_3$ ways, and so on, the number of different ways in
which they can be done when taken all together in the order stated is
$m_1 m_2 m_3 \cdots$.

Example 1. How many three-digit numbers can be formed from the
digits 1, 2, 3, 4, 5 if each digit is used only once? 

The first place can be filled in 5 ways, and after that is done the second 
place can be filled in 4 ways, and then the third place in 3 ways. Hence, 
we can form 5 • 4 • 3 = 60 different numbers of the specified kind. 

Example 2. How many three-digit numbers can be formed from the 
digits 1, 2, 3, 4, 5 if each digit can be repeated? 

The first place can be filled in 5 ways, and after that is done the second 
place can be filled in 5 ways, and then the third place in 5 ways. Hence, 
we can form 5 • 5 • 5 = 125 different numbers of the specified kind. 

Example 3. How many three-digit even numbers can be formed from 
the digits 1, 2, 3, 4, 5 if each digit is used only once? 

The unit’s place can be filled in two ways (either with the 2 or 4). The 
ten’s place can then be filled in 4 ways and the hundred’s place in 3 ways. 
In all, there are 3 • 4 • 2 = 24 numbers of the specified kind. 
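
The counts in Examples 1 to 3 can be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the original text; the variable names are illustrative, and it uses only the standard `itertools` module.

```python
from itertools import permutations, product

digits = [1, 2, 3, 4, 5]

# Example 1: three-digit numbers with no digit repeated (ordered selections).
no_repeats = list(permutations(digits, 3))

# Example 2: three-digit numbers when each digit may be repeated.
with_repeats = list(product(digits, repeat=3))

# Example 3: even three-digit numbers with no digit repeated;
# the units digit (last position) must be 2 or 4.
even_no_repeats = [p for p in no_repeats if p[-1] % 2 == 0]

print(len(no_repeats), len(with_repeats), len(even_no_repeats))  # prints: 60 125 24
```

Enumeration is practical only for small cases such as these; the fundamental principle gives the same counts without listing anything.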

Example 4. In an introductory course in statistical analysis there are 
four lecture sections, A, B, C, D, and three laboratory sections, X, Y, Z.
In how many ways may a student choose a section in each? 

He may choose the lecture section in 4 ways and the laboratory section 
in 3 ways. He may choose both in 4 • 3 = 12 ways. 

Question. In an election there are three candidates for mayor and 
four candidates for treasurer. In how many ways can a ballot be marked 
for both of these offices? 


EXERCISES 

1. If 2 coins are tossed, in how many ways can they fall?

2. If 3 coins are tossed, in how many ways can they fall?

3. If 2 dice are thrown, in how many ways can they fall?

4. If 2 dice and 3 coins are tossed, in how many ways can they fall?

5. How many signals can be made by hoisting 3 flags if there are 9
different flags from which to choose?

6. In how many different ways can 3 positions be filled by selections
from 15 different people?

7. How many four-digit numbers can be formed from the digits 1, 2,
3, 4, 5, 6, 7, 8, 9?

91. NUMBER OF PERMUTATIONS 


In the preceding section we wrote the permutations of the three
letters a, b, c, taken all at a time. We may also write the permutations
of the same three letters taken two at a time. They are
ab, ac, ba, bc, ca, cb.

Now let us consider the general problem: the number of permutations
of n things taken r at a time ($r \le n$). The number of permutations
of n things taken r at a time is denoted by ${}_nP_r$ and is given
by the formula: 1

$${}_nP_r = n(n-1)(n-2) \cdots (n-r+1) = \frac{n!}{(n-r)!} \qquad (1)$$


There are r places to fill and n things from which to choose. The
first place may be filled in n ways, the second in (n - 1) ways,
the third place in (n - 2) ways, and so on. The rth place may be
filled in (n - r + 1) ways. Applying the theorem of Section 90,
we immediately have (1).

If all n things are taken n at a time, r = n, and we have:

$${}_nP_n = n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = n! \qquad (2)$$


Since ${}_nP_n = n! = \frac{n!}{(n-n)!} = \frac{n!}{0!}$, the use of the second form of (1)
when n = r requires that we define 0! to equal unity.
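
As a numerical check on formulas (1) and (2), the sketch below compares the product form with the factorial form. It is an illustration only: the function names are hypothetical, and `math.perm` (available in Python 3.8 and later) is used as an independent reference.

```python
import math


def permutations_count(n: int, r: int) -> int:
    """Product form of formula (1): n(n-1)(n-2)...(n-r+1)."""
    count = 1
    for k in range(r):
        count *= n - k  # successive factors n, n-1, ..., n-r+1
    return count


def permutations_count_factorial(n: int, r: int) -> int:
    """Factorial form of formula (1): n!/(n-r)!."""
    return math.factorial(n) // math.factorial(n - r)


print(permutations_count(5, 3), permutations_count_factorial(5, 3))  # prints: 60 60
print(permutations_count(4, 4), math.factorial(4))                   # prints: 24 24
print(math.factorial(0))                                             # prints: 1
```

The last line shows that Python's `math.factorial` follows the same convention, 0! = 1, so the factorial form remains valid when r = n.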

It frequently happens that some restrictions are imposed upon the 
number of permutations we are seeking. Whenever any restriction 
exists, it is important to consider the restricted groups first. The 
method is illustrated by the following: 


Example. How many six-place numbers can be formed from the digits
1, 2, 3, 4, 5, 6, if 3 and 4 are always to occupy the middle two places?

The two digits, 3 and 4, can be arranged in 2! ways. The other four digits
can be arranged in 4! ways. Hence in all there are 2! • 4! = 48 numbers.
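
The restricted count can be confirmed by listing all 6! arrangements and keeping those that satisfy the restriction. This is a small sketch under the assumption that "the middle two places" means the third and fourth positions of the six-place number.

```python
import math
from itertools import permutations

digits = "123456"

# Keep the arrangements whose third and fourth places hold 3 and 4 in either order.
restricted = [
    p for p in permutations(digits)
    if set(p[2:4]) == {"3", "4"}
]

print(len(restricted))                          # prints: 48
print(math.factorial(2) * math.factorial(4))    # prints: 48
```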


1 n! = 1 • 2 • 3 • • • n is called factorial n.





EXERCISES 

1. How many different numbers less than 1,000 can be formed from the
digits 1, 2, 3, 4, 5, 6?

2. Five persons enter a car in which 8 seats are vacant. In how many
ways can they be seated?

3. In how many ways can 10 boys stand in a row when:
(a) a given boy is at a given end? (b) a given boy is at an end?
(c) two given boys are always together? (d) two given boys are never
together?

4. In how many ways can 3 different algebras and 4 different geometries
be arranged on a shelf so that the algebras are always together?

5. Find the number of permutations, P, of the letters a a b b b taken 5
at a time. Hint: $P \cdot 2! \cdot 3! = 5!$

6. If P represents the number of distinct permutations of n things, taken
all at a time, when, of the n things, there are $n_1$ alike, $n_2$ others alike, $n_3$
others alike, etc., then:

$$P = \frac{n!}{n_1!\, n_2!\, n_3! \cdots}$$

7. How many distinct permutations can be made of the letters of the
word attention taken all at a time?

8. How many distinct permutations of the letters of the word Mississippi
can be formed taking the letters all at a time?

9. In how many ways can ten balls be arranged in a line if 3 are white, 5
are red, and 2 are blue?


92. COMBINATIONS 

A group of things or elements without reference to the order of
the individuals in the group is called a combination.

Thus, the combinations of a, b, c, d taken 3 at a time are abc, abd,
acd, bcd. From each combination we can form 3! different permutations,
and hence from the 4 combinations we can form (3!) • 4 = 24
permutations of 4 letters taken 3 at a time.

A combination is frequently called a selection, whereas a permuta- 
tion is an arrangement. 

The number of combinations of n things taken r at a time is denoted
by ${}_nC_r$, and is given by the formula:

$${}_nC_r = \frac{{}_nP_r}{r!} \qquad (3)$$

For r! permutations can be formed from each combination of r
elements; and hence the total number of permutations must be r!
times the number of combinations, ${}_nC_r$. That is, $r! \cdot {}_nC_r = {}_nP_r$, from
which (3) immediately follows.

By applying (1):

$${}_nC_r = \frac{n(n-1)(n-2) \cdots (n-r+1)}{r!} = \frac{n!}{r!\,(n-r)!} \qquad (4)$$

From (4) it follows immediately that:

$${}_nC_r = {}_nC_{n-r} \qquad (5)$$
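
The relation between combinations and permutations can be illustrated with the four letters used above. The sketch below is not from the original text; `combinations_count` is a hypothetical helper, and `math.comb` and `math.perm` (Python 3.8 and later) serve as reference values.

```python
import math
from itertools import combinations, permutations

letters = "abcd"
combos = ["".join(c) for c in combinations(letters, 3)]
perms = ["".join(p) for p in permutations(letters, 3)]

print(combos)      # prints: ['abc', 'abd', 'acd', 'bcd']
print(len(perms))  # prints: 24


def combinations_count(n: int, r: int) -> int:
    """Formula (3): the number of permutations divided by r!."""
    return math.perm(n, r) // math.factorial(r)


# Formula (4) agrees with (3), and formula (5) gives the symmetry nCr = nC(n-r).
print(combinations_count(12, 9), math.comb(12, 9), math.comb(12, 3))  # prints: 220 220 220
```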

The binomial theorem, which is usually written in the form

$$(a+b)^n = a^n + na^{n-1}b + \frac{n(n-1)}{2!}a^{n-2}b^2 + \cdots + \frac{n(n-1) \cdots (n-r+1)}{r!}a^{n-r}b^r + \cdots + b^n,$$

may be conveniently written

$$(a+b)^n = a^n + {}_nC_1 a^{n-1}b + {}_nC_2 a^{n-2}b^2 + \cdots + {}_nC_r a^{n-r}b^r + \cdots + b^n \qquad (6)$$

$$= \sum_{r=0}^{n} {}_nC_r\, a^{n-r} b^r \qquad (7)$$

if we define ${}_nC_0$ to be 1.
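
A quick numerical check of the summation form (7) is sketched below. The function name `binomial_expansion` is illustrative only; the check evaluates the sum for particular integers and compares it with the direct power.

```python
import math


def binomial_expansion(a: int, b: int, n: int) -> int:
    """Right-hand side of (7): the sum over r of nCr * a^(n-r) * b^r."""
    return sum(math.comb(n, r) * a ** (n - r) * b ** r for r in range(n + 1))


# Coefficients of (a + b)^4, with nC0 taken as 1.
coefficients = [math.comb(4, r) for r in range(5)]
print(coefficients)                                # prints: [1, 4, 6, 4, 1]
print(binomial_expansion(2, 3, 5), (2 + 3) ** 5)   # prints: 3125 3125
```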

We shall now illustrate these remarks with a few examples. 

Example 1. In how many ways can a committee of 9 be selected from 
12 people? 

This is a problem of selection, not of arrangement, and the
result is:

$${}_{12}C_9 = {}_{12}C_3 = \frac{12 \cdot 11 \cdot 10}{1 \cdot 2 \cdot 3} = 220$$


Example 2. From 6 men and 5 women, in how many ways can we select 
a group of 4 men and 3 women? 

a. We can select the 4 men from 6 men in ${}_6C_4$ ways.

b. We can select the 3 women from 5 women in ${}_5C_3$ ways.

By the fundamental principle we can do a. and b. in ${}_6C_4 \cdot {}_5C_3 = 150$
ways.

Example 3. From 6 men and 5 women, how many committees of 8 each 
can be formed when the committee contains at least 3 women? 

The conditions of the problem are satisfied if the committee contains:

a. 5 men and 3 women

b. 4 men and 4 women

c. 3 men and 5 women

Therefore the number of possible committees is

$${}_6C_5 \cdot {}_5C_3 + {}_6C_4 \cdot {}_5C_4 + {}_6C_3 \cdot {}_5C_5 = 155$$
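
The three committee examples reduce to a few evaluations of the combination formula. The following sketch, with illustrative variable names, reproduces them using `math.comb`; in Example 3 the loop runs over the possible numbers of women (3, 4, or 5) on a committee of 8.

```python
import math

# Example 1: a committee of 9 from 12 people.
example_1 = math.comb(12, 9)

# Example 2: 4 men from 6 and 3 women from 5, combined by the fundamental principle.
example_2 = math.comb(6, 4) * math.comb(5, 3)

# Example 3: committees of 8 with at least 3 women, from 6 men and 5 women.
example_3 = sum(
    math.comb(6, 8 - women) * math.comb(5, women)
    for women in range(3, 6)
)

print(example_1, example_2, example_3)  # prints: 220 150 155
```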

It frequently happens that the problem involves both a selection and an 
arrangement with a limitation upon either. In such problems it is best to 
consider the two steps separately. A safe procedure is to deal first with the 
question of the selections (combinations) and then with the arrangements 
(permutations) . 

Example 4. How many line-ups are possible in choosing a baseball nine 
of 5 seniors and 4 juniors from a squad of 8 seniors and 7 juniors, if any 
man can be used in any position? 

The 5 seniors can be selected in ${}_8C_5$ ways, the 4 juniors in ${}_7C_4$ ways.
Hence the set of players can be selected in ${}_8C_5 \cdot {}_7C_4$ ways.

Any one set of 9 men can be arranged in 9! ways. Hence the total number
of possible line-ups is ${}_8C_5 \cdot {}_7C_4 \cdot 9!$.


EXERCISES 

1. Compute ioC 2 ; uA; 100 CW 

2. How many squads of 6 men each can be selected from a squad of 60
men?

3. In how many ways can a committee of 3 teachers and 2 students be
selected from 8 teachers and 15 students?

4. How many straight lines are determined from 10 points, no 3 of which
are in the same straight line?

5. How many different sums can be made from a cent, a nickel, a dime,
a quarter, a half-dollar, and a dollar?

6. From 10 books, in how many ways can a selection of 6 be made:
(a) when a specified book is always included? (b) when a specified book
is always excluded?

7. Prove that ${}_nC_r + {}_nC_{r-1} = {}_{n+1}C_r$.

8. Out of 6 different consonants and 4 different vowels, how many linear
arrangements of letters, each containing 4 consonants and 3 vowels, can be
formed?

9. A lodge has 50 members of whom 6 are physicians. In how many
ways can a committee of 10 be chosen so as to contain at least 3 physicians?

10. In equation (6) make a = b = 1, and show that

$${}_nC_1 + {}_nC_2 + \cdots + {}_nC_n = 2^n - 1$$

11. Solve Exercise 5 above, using Exercise 10.

12. In how many ways can 7 men stand in line so that 2 particular men
will not be together?

13. A committee of 7 is to be chosen from 8 Englishmen and 5 Americans.
In how many ways can a committee be chosen if it is to contain: (a) just 4
Englishmen? (b) at least 4 Englishmen?





14. Prove: ${}_{n+2}C_{r+1} = {}_nC_{r+1} + 2 \cdot {}_nC_r + {}_nC_{r-1}$.

15. If ${}_nP_r = 110$ and ${}_nC_r = 55$, find n and r.

16. If ${}_nC_4 = {}_nC_2$, find n.

17. If ${}_nC_3 = \frac{10}{21}\left({}_nC_5\right)$, find n.

18. If ${}_{2n}C_{n-1} = \frac{91}{24}\left({}_{2n-2}C_n\right)$, find n.

19. Prove: ${}_nC_1 + 2 \cdot {}_nC_2 + 3 \cdot {}_nC_3 + \cdots + n \cdot {}_nC_n = n(2)^{n-1}$.

93. RELATIVE FREQUENCY: EMPIRICAL PROBABILITY 

A box contains 2 white and 3 black balls alike except in color. A 
ball is drawn at random, the color of it is noted, and then it is re- 
placed in the box. The drawing of the ball and replacing it is called 
a trial. Suppose we make 100 such drawings, mixing the balls
thoroughly after each trial, and note that in this sample of 100 
drawings we have obtained 38 white and 62 black balls. Then we 
say 38/100 is the relative frequency of white balls and 62/100 is the 
relative frequency of black balls in this set of trials. Suppose that this 
experiment is repeated and that in the next 100 trials we obtain 
42 white balls and 58 black balls. In the second sample of 100 trials 
the relative frequency of white balls is 42/100 and that of the black 
balls is 58/100. If the results of the two sample sets are combined, 
we will then have obtained in the 200 drawings 80 white balls and 
120 black balls, and the resulting relative frequencies of white balls 
and black balls are 80/200 = 2/5 and 120/200 = 3/5 respectively. 

In performing experiments of the type described in the preceding
paragraph the happening of the event in question is frequently called
a success, and the nonhappening of the event a failure. In the experiments
described the drawing of a white ball may be counted a success
and that of the black ball a failure. It may be noted that the sum
of the relative frequencies of white balls and black balls in every
sample drawing is equal to unity. In general if we make s + f = n
trials resulting in s successes and f failures we say that:

s/n = the relative frequency of the successes

and

f/n = the relative frequency of the failures

The sum of the relative frequencies of successes and of failures in
any set of trials is equal to:

$$\frac{s}{n} + \frac{f}{n} = \frac{s+f}{n} = \frac{n}{n} = 1$$



The fraction s/n, which we have called the relative frequency of 
successes in n trials, may be considered an approximate probability 
derived from observation. If n is large, then, until further knowledge 
is obtained, s/n may be taken as a good estimate of the probability 
of success in a given trial. Our confidence in this estimate increases 
as the number, n, of observed cases increases. If, as n increases 
indefinitely, the ratio s/n approaches a limiting value, p, this limiting 
value is called the probability of a success in one trial. Hence:

$$p = \lim_{n \to \infty} \frac{s}{n}$$

Thus, if we continue indefinitely the drawing of a ball from a box 
2/5 of the contents of which are white balls, we may assume that 
the relative frequency of white balls would approach 2/5, and we say 
2/5 is the probability of obtaining a white ball in a single trial. 

The probability that we have thus far discussed as coming from 
observation is frequently called empirical probability. 

The empirical method of determining probability is widely used 
in statistics, pension systems, life insurance, fire insurance, etc. In 
using the experimental method we shall simply idealize actual ex- 
perience and assume that the limit of s/n exists, and that, if n is 
large, s/n is a good estimate of the limit. 
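
The tendency of s/n to settle near 2/5 in the ball-drawing experiment of this section can be imitated with a pseudo-random simulation. The sketch below is an illustration only, not an experiment reported in the text: the function name, the seed, and the numbers of trials are arbitrary assumptions, and a pseudo-random generator stands in for thorough mixing of the balls.

```python
import random


def relative_frequency_of_white(trials: int, seed: int = 0) -> float:
    """Draw with replacement from a box of 2 white and 3 black balls; return s/n for white."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    box = ["white"] * 2 + ["black"] * 3
    successes = sum(1 for _ in range(trials) if rng.choice(box) == "white")
    return successes / trials


for n in (100, 1_000, 100_000):
    print(n, relative_frequency_of_white(n))
```

No particular output is claimed here, since the values depend on the generator; the point is only that the printed relative frequencies should lie closer to 0.4 as the number of trials grows.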

EXERCISES 

1. In a certain experiment of coin-tossing heads appeared 2,048 times in 
4,040 throws. What is the relative frequency of heads? of tails? 

2. In an experiment in coin tossing, 7 dimes were thrown 128 times with
the following results:

Number of Heads    Number of Times X Heads Appeared
       X                        f(X)
       0                          2
       1                          8
       2                         16
       3                         38
       4                         43
       5                         16
       6                          2
       7                          3
     Total                      128

Find the relative frequency of 0 heads; 1 head; etc.

Compute M and σ for this distribution.

3. Among 10,000 people aged 30, 85 deaths occurred in a year. What
was the relative frequency of deaths?

4. Out of 1,000 children born in a city in a given year, 514 were boys
and 486 were girls. What is the relative frequency of boys among the children
that year?

5. As a cooperative exercise for the class, make 1,000 tosses of a coin 
and keep a record of the number of heads in (a) 10 trials, (b) 100 trials, 
(c) 250 trials, (d) 1,000 trials. In each case compare the observed relative 
frequency with the expected relative frequency, 1/2. 


94. THEORETICAL RELATIVE FREQUENCY: 

A PRIORI PROBABILITY 

In certain cases, such as games of chance or drawing balls from a 
bag, the probability may be obtained without collecting statistical 
data on frequencies. In these cases we make use of certain assump- 
tions that will give us the probability without actually making the 
trials. For example, if a coin is tossed we assume that it is so con- 
structed and tossed that tails are just as likely to come up as heads, 
and hence: 

the probability of heads = the probability of tails = 1/2

Similarly, if a bag contains 4 white balls and 6 black balls alike 
except as to color, and thoroughly mixed, and a ball is drawn at 
random, the probability of drawing a white ball is 4/10 and the prob- 
ability of drawing a black ball is 6/10. These illustrations are simple 
applications of the following : 

Definition. If all the successes and failures can be analyzed into
s + f possible ways, each of which is equally likely, and if s of these
ways give successes and f of them failures, the probability of success in a
single trial is defined as p = s/(s + f) and the probability of failure is
defined as q = f/(s + f).

Example 1. A bag contains 8 black balls and 3 white balls, and a ball is 
drawn at random. What is the probability of drawing a white ball? a 
black ball? 

If the drawing of a white ball is counted a success, we have
s = 3, f = 8, s + f = 11, and hence p = 3/11 and q = 8/11.




Returning to the foregoing definition, we may note that if p = 0, 
the event in question cannot happen or is impossible. If p = 1, the 
event is certain to happen. 

Example 2. From a bag containing 8 white balls and 3 black balls, 5 
balls are drawn at random. What is the probability that 3 are white and 
2 are black? 

The total number of balls in the bag is 11. Hence the number of ways of
selecting 5 balls from 11 balls is ${}_{11}C_5$. The 3 white balls can be selected from
8 white balls in ${}_8C_3$ ways, and the 2 black balls can be selected from 3 black
balls in ${}_3C_2$ ways. Hence $s = {}_8C_3 \cdot {}_3C_2$, and the probability of drawing 3
white and 2 black balls is:

$$p = \frac{{}_8C_3 \cdot {}_3C_2}{{}_{11}C_5} = \frac{4}{11}$$

Example 3. If 5 coins are tossed, what is the probability of obtaining 
2 heads and 3 tails? 

Five coins may fall in $2^5 = 32$ ways. Two heads may be selected from
the 5 in ${}_5C_2 = 10$ ways. Hence the probability is 10/32 = 5/16.
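
Examples 2 and 3 can be verified with exact fractions. The sketch below is an illustration with hypothetical variable names: Example 2 is computed from the definition p = s/(s + f) using combination counts, and Example 3 by listing the 32 equally likely ways in which 5 coins can fall.

```python
import math
from fractions import Fraction
from itertools import product

# Example 2: 3 white and 2 black in a draw of 5 from 8 white and 3 black balls.
favorable_draws = math.comb(8, 3) * math.comb(3, 2)
all_draws = math.comb(11, 5)
p_example_2 = Fraction(favorable_draws, all_draws)

# Example 3: exactly 2 heads among 5 tossed coins, by direct enumeration.
outcomes = list(product("HT", repeat=5))
two_heads = [o for o in outcomes if o.count("H") == 2]
p_example_3 = Fraction(len(two_heads), len(outcomes))

print(p_example_2, p_example_3)  # prints: 4/11 5/16
```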


EXERCISES 

1. If a die is thrown, what is the probability that a six will appear? 
that either a five or a six will appear? that a four, five, or six will appear? 

2. If 2 dice are thrown, what is the probability of obtaining a double 
six? 

3. If 2 dice are thrown, what is the probability of obtaining a sum of 
11? a sum of 7? What is the most probable sum in a throw of 2 dice? 

4 . A deck of 52 cards is well shuffled and a card is drawn. What is the 
probability that it is a queen? an ace or a queen? a heart? a red card? 

5. What is the chance of throwing one and only one five in one throw
with 2 dice?

6. If 2 dice are thrown, what is the chance of throwing at least one five?

7. If 2 coins are tossed, what is the probability of obtaining 2 heads?
2 tails? 1 head and 1 tail?

8. If 3 coins are tossed, what is the probability of getting 3 heads?
3 tails? 2 heads and 1 tail?

9. What is more likely to happen, a throw of four with 1 die or a throw 
of six with 2 dice? 

10 . What is the probability of throwing 2 sixes and 1 five in a single 
throw with 3 dice? 

11 . If 12 men stand in line, what is the chance that A and B are next to 
each other? 





12. From a pack of 52 cards, 3 cards are drawn at random. What is
the chance that they are all clubs?

13. Prove:

$${}_{2n}C_{n+r+1} = {}_{2n}C_{n+r} \cdot \frac{n-r}{n+r+1}$$

95. EXPECTATION 

The expected number of occurrences of an event in n trials is defined 
as np where p is the probability of occurrence of the event in a single 
trial. 

Thus, if 100 coins are thrown or if 1 coin is thrown 100 times,
theoretically we expect 50 heads and 50 tails, for n = 100 and
p = q = 1/2.

If a die is rolled 36 times, theoretically we expect an ace to turn
up 6 times, for n = 36 and p = 1/6.

If .008 is the probability of death within a year of a man aged 30, 
the expected number of deaths within a year among 10,000 men of 
this age would be 80, for n = 10,000 and p = .008. 

Question: Two coins are thrown 100 times. What is the expected num- 
ber of 2 heads? 2 tails? 1 head and 1 tail? 

If p is the probability that a person will win a sum of money m, 
we define his expectation as pm. 

Thus, if a person is to receive $32 in case he tosses 4 coins and they all
fall heads, the value of his expectation is $2, for m = $32 and p = 1/16.
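
The two definitions of this section, np for the expected number of occurrences and pm for a money expectation, are simple products. The sketch below reproduces the worked illustrations with exact fractions; the function names are hypothetical.

```python
from fractions import Fraction


def expected_occurrences(n: int, p: Fraction) -> Fraction:
    """Expected number of occurrences in n trials: np."""
    return n * p


def expectation(p: Fraction, m: int) -> Fraction:
    """Value of the expectation of winning a sum m with probability p: pm."""
    return p * m


print(expected_occurrences(100, Fraction(1, 2)))        # heads in 100 tosses; prints: 50
print(expected_occurrences(36, Fraction(1, 6)))         # aces in 36 rolls; prints: 6
print(expected_occurrences(10_000, Fraction(8, 1000)))  # deaths among 10,000 men; prints: 80
print(expectation(Fraction(1, 2) ** 4, 32))             # 4 heads for $32; prints: 2
```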

Question : A stake of $24 is made contingent upon getting a sum greater 
than 10 in a single throw with 2 dice. What is the value of the expectation? 


96. SOME ELEMENTARY THEOREMS 

A. Mutually Exclusive Events. Two or more events are said to be 
mutually exclusive when the occurrence of any one of them excludes 
the occurrence of any other. Thus, in the toss of a coin the appearance 
of heads and the appearance of tails are mutually exclusive. Also, 
if a bag contains white and black balls and a ball is drawn, the 
drawing of a white ball and the drawing of a black ball are mutually 
exclusive events. 





Theorem. If $p_1, p_2, \ldots, p_r$ are the separate probabilities of r
mutually exclusive events, the probability P that one of these events will
happen in a single trial is the sum of the probabilities of the separate
events. That is:

$$P = p_1 + p_2 + \cdots + p_r \qquad (8)$$

By the definition in the preceding section, out of n trials in which
all of the events are in question, the r events are expected to occur
$p_1 n, p_2 n, \ldots, p_r n$ times respectively. Since only one of these
events can occur on a given trial, it follows that out of n trials one
or another of the r events will occur $(p_1 n + p_2 n + \cdots + p_r n)$ or
$(p_1 + p_2 + \cdots + p_r)n$ times. That is, the total probability that
one of the events will occur on a given trial is:

$$P = \frac{p_1 n + p_2 n + \cdots + p_r n}{n} = p_1 + p_2 + \cdots + p_r$$

When two mutually exclusive events are in question, the probabilities
are frequently called either-or probabilities. Thus, if a die is
thrown, the probability of either an ace or a deuce is 1/6 + 1/6, or 1/3.
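
The addition theorem for mutually exclusive events can be checked on a single die by counting faces. The sketch below is illustrative: the helper `probability` is a hypothetical name, and it assumes the six faces are equally likely.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]


def probability(event) -> Fraction:
    """Fraction of the six equally likely faces for which the event holds."""
    return Fraction(sum(1 for face in faces if event(face)), len(faces))


p_ace = probability(lambda face: face == 1)
p_deuce = probability(lambda face: face == 2)
p_either = probability(lambda face: face in (1, 2))

# An ace and a deuce cannot appear together, so the separate probabilities add.
print(p_ace, p_deuce, p_either)  # prints: 1/6 1/6 1/3
```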

B. Independent Events. Two or more events are dependent or 
independent according as the occurrence of any one of them does or 
does not affect the occurrence of the others. Thus, if A tosses a 
coin and B throws a die, the tossing of heads by A and the throwing 
of a deuce by B are independent events. However, if a bag contains 
a mixture of white and black balls and a ball is drawn and not re- 
turned to the bag, the probabilities in a second drawing will be 
dependent upon the first event. 

Theorem. If $p_1, p_2, \ldots, p_r$ are the separate probabilities of r
independent events, the probability P that they all occur on a given
trial when all of them are in question is the product of their separate
probabilities. That is:

$$P = p_1 p_2 p_3 \cdots p_r \qquad (9)$$

By the definition of the preceding section, out of n trials in which
all of the events are in question, the first event is expected to occur
$p_1 n$ times. Out of this number, $p_1 n$, the second event is expected to
occur $p_2(p_1 n) = n p_1 p_2$ times. That is, both are expected to occur
$n p_1 p_2$ times in n trials. Continuing this process, it is seen that out
of n trials all of the r events are expected to occur $n p_1 p_2 p_3 \cdots p_r$
times. Hence:

$$P = \frac{n p_1 p_2 p_3 \cdots p_r}{n} = p_1 p_2 p_3 \cdots p_r$$

Example 1. If A tosses a coin and B throws a die, what is the probability 
that A will toss heads and B will throw a deuce? 

The probability that A will toss heads is 1/2 and the probability that B
will throw a deuce is 1/6. Since the two events are independent, the probability
that both events will occur is 1/2 • 1/6 = 1/12.

Example 2. If a coin is tossed 3 times, what is the probability of heads 
every time? 

The probability of heads on any throw is 1/2. Hence for the 3 throws,
since they are independent, P = 1/2 • 1/2 • 1/2 = 1/8.

When two independent events are in question, the probabilities are frequently
called both-and probabilities. Thus in Example 1 if the tossing of
heads by A is event $E_1$ and the throwing of a deuce by B is event $E_2$, then the
probability that both $E_1$ and $E_2$ occur is 1/12.

In Example 1, what is the probability that either A will toss heads or B 
will throw a deuce? 


Corollary. If $p_1, p_2, \ldots, p_r$ are the separate probabilities of r
independent events, the probability that they will all fail on a given
occasion is

$$(1 - p_1)(1 - p_2) \cdots (1 - p_r) \qquad (10)$$

and the probability that the first k events will occur and the remainder
fail is

$$p_1 p_2 \cdots p_k (1 - p_{k+1}) \cdots (1 - p_r) \qquad (11)$$
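Formulas (9), (10), and (11) can be checked with exact fractions. The following is a minimal sketch in Python; the function names are illustrative only and are not part of the text. It also evaluates the either-or question asked about Example 1 by way of the complement, assuming that "either ... or" is read inclusively.

```python
from fractions import Fraction
from math import prod

def prob_all_occur(probs):
    """Formula (9): product of the separate probabilities."""
    return prod(probs, start=Fraction(1))

def prob_all_fail(probs):
    """Formula (10): product of the complements (1 - p_i)."""
    return prod((1 - p for p in probs), start=Fraction(1))

def prob_first_k_only(probs, k):
    """Formula (11): the first k events occur and the remaining ones fail."""
    return prob_all_occur(probs[:k]) * prob_all_fail(probs[k:])

heads, deuce = Fraction(1, 2), Fraction(1, 6)
both = prob_all_occur([heads, deuce])              # both-and probability
at_least_one = 1 - prob_all_fail([heads, deuce])   # inclusive either-or
print(both, at_least_one)
```

Working with `Fraction` rather than floating-point numbers keeps the results in the same form as the worked examples.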


C. Dependent Events. The following theorem for dependent 
events may be proved by an analogous method to that used for 
independent events. 


Theorem. If the probability of a first of r events is $p_1$, and if, after
this has occurred, the probability of a second event is $p_2$, and if, after
the first and second events have occurred, the probability of a third event
is $p_3$, and so on, then the probability P that the events will occur in the
order specified is:

$$P = p_1 p_2 p_3 \cdots p_r \qquad (12)$$
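As an illustration of (12), consider drawing balls without replacement, where each factor is the probability of the next draw given the draws already made. The sketch below is in Python; the bag of 3 white and 2 black balls and the function name are hypothetical choices made only for this illustration.

```python
from fractions import Fraction

def prob_ordered_draws(counts, sequence):
    """Probability of drawing the given colour sequence without replacement.

    counts   -- dict mapping colour to the number of balls in the bag
    sequence -- colours in the order in which they must be drawn
    """
    remaining = dict(counts)
    total = sum(remaining.values())
    prob = Fraction(1)
    for colour in sequence:
        prob *= Fraction(remaining[colour], total)  # p_i after the earlier draws
        remaining[colour] -= 1
        total -= 1
    return prob

# Hypothetical bag: 3 white and 2 black balls, two whites drawn in succession.
two_whites = prob_ordered_draws({"white": 3, "black": 2}, ["white", "white"])
print(two_whites)
```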


EXERCISES 

1. If 5 balls are drawn from a bag containing 6 red and 9 white balls, 
what is the probability: (a) that all will be red? (b) that 3 will be red and 
2 white? 





2 . A draws 3 cards from a well-shuffled pack and simultaneously B 
tosses a coin. What is the probability of 3 aces and 1 head? 

3. If 4 coins are tossed, what is the probability that all will fall tails? 

4 . A, B, and C go bird-hunting. A has a record of 1 bird out of 2, B gets 
2 out of 3, and C gets 3 out of 4. What is the probability that they will 
kill a bird at which all shoot simultaneously? Hint: What is the proba- 
bility that all 3 miss? 

5. If the probability that A will die within a year is and the probability
that B will die within a year is what is the probability that:
(a) both A and B will die within a year? (b) both A and B will live a year?
(c) one life will fail within a year?

6. The probability that A will solve a problem is \ and that B will solve 
it is § . What is the probability that if A and B try the problem it will be 
solved? 

7 . In a single throw of 2 dice what is the chance that neither doublets 
nor seven will appear? 


97. REPEATED TRIALS 

As we proceed into the text the observing student will be amazed 
at the importance of the theory of the probability of repeated trials 
in the theory of statistics. This is, of course, due primarily to the 
fact that much of statistical data is a kind of repeated measurement. 

In order to familiarize ourselves with the method of proof of the 
general theorem of this section, let us consider something simple. 

Example. What is the probability of throwing 2 aces in 4 throws of a die? 

The conditions of the problem are met if in the first 2 throws we obtain 
aces and in the next 2 throws not-aces; or if in the first throw we get ace, 
the second throw not-ace, the third throw ace, and the fourth throw not- 
ace; and so on. We shall illustrate the possibilities symbolically as follows: 

A1 A2 - -,   A1 - A3 -,   A1 - - A4,   - A2 A3 -,   - A2 - A4,   - - A3 A4

Considering the first case, the probability of throwing an ace on any
throw is $\frac{1}{6}$. The probability of not throwing an ace on any throw is $\frac{5}{6}$.
Hence the probability of throwing an ace on the first and second throws
and not throwing an ace on the two remaining throws is $(\frac{1}{6})^2(\frac{5}{6})^2$.

In the second case, the probability of events occurring as the symbol
above indicates is $(\frac{1}{6})(\frac{5}{6})(\frac{1}{6})(\frac{5}{6}) = (\frac{1}{6})^2(\frac{5}{6})^2$.

The remaining cases may be treated in a similar manner, and in each
instance the result for any specified set is $(\frac{1}{6})^2(\frac{5}{6})^2$. Now it is evident that
the 2 throws showing aces can be selected from the 4 throws in ${}_4C_2 = 6$ ways. Since
the 6 cases are mutually exclusive, the chance that one or the other of the
specified cases occurs is $6(\frac{1}{6})^2(\frac{5}{6})^2 = \frac{150}{1296} = \frac{25}{216}$.
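The count of 6 favourable patterns and the resulting probability can be confirmed by listing every possible result of 4 throws. This is a brute-force sketch in Python, offered only as a check on the argument above; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 6**4 equally likely ordered results of 4 throws of a die.
outcomes = list(product(range(1, 7), repeat=4))

# A result is favourable when exactly 2 of the 4 throws show an ace (a 1).
favourable = sum(1 for throws in outcomes if throws.count(1) == 2)
probability = Fraction(favourable, len(outcomes))
print(favourable, len(outcomes), probability)
```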





Let us now consider an important theorem. 

Theorem of Repeated Trials. If p is the probability of the success
of an event in a single trial and q is the probability of its failure
(p + q = 1), then the probability $P_r$ that the event will succeed exactly
r times in n trials is:¹

$$P_r = {}_nC_r\, p^r q^{n-r} \qquad (13)$$

For the probability that the event will succeed in each of r specified
trials and will fail in the remaining $(n - r)$ trials is, by (11), $p^r q^{n-r}$.
Further, it is possible for the r successes to occur out of n trials in
${}_nC_r$ different ways. These ways being mutually exclusive, by (8) the
probability in question is $P_r = {}_nC_r\, p^r q^{n-r}$.
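Formula (13) translates directly into a short function. The sketch below is in Python and uses exact fractions; the name `prob_exactly` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction
from math import comb

def prob_exactly(r, n, p):
    """Formula (13): probability of exactly r successes in n trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# The die example above: exactly 2 aces in 4 throws.
two_aces = prob_exactly(2, 4, Fraction(1, 6))

# The probabilities for r = 0, 1, ..., n must add to (p + q)^n = 1.
total = sum(prob_exactly(r, 4, Fraction(1, 6)) for r in range(5))
print(two_aces, total)
```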

The various probabilities are indicated in the following table : 


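Table 87. Values of $P_r$ for Various Values of r

| r | $P_r$ | The Probability That in n Trials There Will Be |
|---|---|---|
| $n$ | $p^n$ | $n$ successes, 0 failures |
| $n - 1$ | ${}_nC_1\, p^{n-1} q$ | $n - 1$ successes, 1 failure |
| $n - 2$ | ${}_nC_2\, p^{n-2} q^2$ | $n - 2$ successes, 2 failures |
| . . . | . . . | . . . |
| $n - r$ | ${}_nC_r\, p^{n-r} q^r$ | $n - r$ successes, $r$ failures |
| . . . | . . . | . . . |
| $r$ | ${}_nC_r\, p^r q^{n-r}$ | $r$ successes, $n - r$ failures |
| . . . | . . . | . . . |
| 2 | ${}_nC_2\, p^2 q^{n-2}$ | 2 successes, $n - 2$ failures |
| 1 | ${}_nC_1\, p\, q^{n-1}$ | 1 success, $n - 1$ failures |
| 0 | $q^n$ | 0 successes, $n$ failures |
| Total | $(p + q)^n = 1$ | |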



From Table 87 we have at once the following:

Corollary. The probability that an event will succeed at least r times
in n trials is $P_r + P_{r+1} + \cdots + P_n$, that is:

$$\sum_{i=r}^{n} P_i = p^n + {}_nC_1\, p^{n-1} q + {}_nC_2\, p^{n-2} q^2 + \cdots + {}_nC_r\, p^r q^{n-r} \qquad (14)$$

It will be noted that (14) consists of the first $(n - r + 1)$ terms of
the expansion $(p + q)^n$.
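The corollary amounts to adding the terms of (13) from r up to n. A minimal Python sketch follows; the function name is illustrative, and the die with an ace counted as a success is used only as a convenient test case.

```python
from fractions import Fraction
from math import comb

def prob_at_least(r, n, p):
    """Formula (14): sum of P_k for k = r, r + 1, ..., n."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(r, n + 1))

p = Fraction(1, 6)
at_least_one_ace = prob_at_least(1, 4, p)   # at least one ace in 4 throws
print(at_least_one_ace)
```

With r = 1 the sum omits only the single term $q^n$, so the result equals $1 - q^n$.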

¹ It will be noted that (13) is the $(n - r + 1)$th term of the expansion $(p + q)^n$
and the $(r + 1)$th term of the expansion $(q + p)^n$.





Example. An urn contains 12 white and 24 black balls. What is the 
probability that, in 10 drawings with replacements, exactly 6 white balls 
are drawn? 

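We have:

$$p = \frac{12}{36} = \frac{1}{3}, \qquad q = \frac{24}{36} = \frac{2}{3}, \qquad n = 10, \quad r = 6, \quad n - r = 4,$$

hence:

$$P_6 = {}_{10}C_6 \left(\frac{1}{3}\right)^6 \left(\frac{2}{3}\right)^4 = \frac{3360}{3^{10}}$$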


Since the computation of $P_r$ in (13) involves the computation of
${}_nC_r = \frac{n!}{r!\,(n - r)!}$, we may naturally wonder what can be done when
n and r are so large that the labor of evaluating $n!$, $r!$, and $(n - r)!$
becomes tedious if not prohibitive. At present we can recommend
two alternatives. If tables of the logarithms of factorial n are at
hand,¹ then $P_r$ can be conveniently computed by logarithms. If
such tables are not at hand, approximate results can be found by
applying Stirling's formula, namely:

$$n! \approx e^{-n} n^n \sqrt{2\pi n} \qquad (15)$$

The derivation of this formula depends upon the calculus and is
therefore beyond the scope of this text.² For large values of n, it
gives satisfactory results.
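How satisfactory the approximation is can be examined numerically. The sketch below, in Python, compares (15) with the true factorial by working with logarithms, much as one would with a table of log factorials. It relies on the standard library function `math.lgamma`, for which `lgamma(n + 1)` is the natural logarithm of n!; the helper names are illustrative.

```python
from math import exp, lgamma, log, pi

def log_factorial_stirling(n):
    """Natural logarithm of Stirling's approximation (15) to n!."""
    return n * log(n) - n + 0.5 * log(2 * pi * n)

def stirling_ratio(n):
    """Stirling's value divided by the true n!, computed through logarithms."""
    return exp(log_factorial_stirling(n) - lgamma(n + 1))

for n in (5, 10, 100, 500):
    print(n, round(stirling_ratio(n), 5))
```

The ratio lies slightly below 1 and moves toward 1 as n grows, which is the sense in which the formula improves for large n.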

Consider the following: 

Example. An urn contains 2 white and 3 black balls. What is the 
probability that, in 500 drawings with replacements, exactly 200 white 
balls will be drawn? 


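Solution:

$$n = 500, \qquad r = 200, \qquad p = \frac{2}{5}, \qquad q = \frac{3}{5}$$

$$P_{200} = {}_{500}C_{200} \left(\frac{2}{5}\right)^{200} \left(\frac{3}{5}\right)^{300} = \frac{500!}{200!\;300!} \cdot \frac{2^{200}\, 3^{300}}{5^{500}}$$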

¹ An excellent set of tables is J. W. Glover, Tables of Applied Mathematics,
George Wahr, Ann Arbor, Michigan, 1923.

² See J. L. Coolidge, An Introduction to Mathematical Probability, p. 38, for a
derivation.





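Applying Stirling's formula:

$$P_{200} = \frac{500^{500}\, e^{-500} \sqrt{2\pi \cdot 500}}{200^{200}\, e^{-200} \sqrt{2\pi \cdot 200} \;\cdot\; 300^{300}\, e^{-300} \sqrt{2\pi \cdot 300}} \left(\frac{2}{5}\right)^{200} \left(\frac{3}{5}\right)^{300}$$

$$= \frac{500^{500} \sqrt{5}}{200^{200}\; 300^{300}\; 10\sqrt{12\pi}} \left(\frac{2}{5}\right)^{200} \left(\frac{3}{5}\right)^{300}$$

$$= \frac{5^{500}\; 100^{500}\; \sqrt{5}\; 2^{200}\; 3^{300}}{2^{200}\; 100^{200}\; 3^{300}\; 100^{300}\; 10\sqrt{12\pi}\; 5^{500}}$$

$$= \frac{\sqrt{5}}{10\sqrt{12\pi}} = .036.$$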

If Glover’s Tables are used with logarithms the result is: 

P 200 = .041 


EXERCISES 

1. A coin is tossed 7 times, or 7 coins are tossed one time. Find the 
probability of exactly: (a) no heads, (b) 1 head, (c) 2 heads, etc. to 7 heads. 

2 . Seven coins are tossed 128 times. Using the Definition in Section 95 
(p. 374), and the probabilities of the last exercise (1), find the expected 
number of times of 0 heads, 1 head, 2 heads, etc. to 7 heads. Compare the 
results with those of Exercise 2 (p. 371). 

3. If a die is thrown 6 times or if 6 dice are thrown 1 time, what is the 
probability of obtaining: (a) exactly 2 aces? (b) at least 3 aces? 

4 . Find the probability of throwing with a single die a deuce at least 
once in 5 trials. 

5. Prove that the probability that an event will succeed at least once
in n trials is $(1 - q^n)$.

6. In tossing 10 coins, what is the probability of obtaining at least 
8 heads? 

7 . A man whose batting average is will bat 4 times in a game. What 
is the probability that he will get (a) exactly 2 hits? (b) at least 2 hits? 

8. According to the American Experience Table of Mortality, out of 
100,000 persons living at the age of 10 years, 91,914 are living at the age 
of 21 years. Each of 7 boys is now 10 years old. What is the probability 
that exactly 5 of them will live to be 21? 

9. A bag contains 4 white and 2 black balls. Five balls are drawn with 
replacements. What is the probability: (a) that exactly 3 are white? 
(b) that at least 3 are black? 

10 . What is the probability of throwing at least 3 sevens in 5 throws 
with a pair of dice? 

11 . How many throws with 2 dice will be required in order that the 
probability of obtaining a double six at least once will have the value §? 

Hint: If J - 1 - («)» find n. 





12. At an old men's home are 5 seventy-year-old men. Find the probability
that (a) exactly 2 of them will die within a year, (b) a specified
2 of them will die within a year, (c) at least 2 of them will die within
a year. The probability that a man aged 70 lives a year is $p_{70} = 0.94$.
Hence $q_{70} = 0.06$.

13. Hospital records show that 5 per cent of cases of a certain disease
are fatal. Five patients are admitted with this disease. Find the probability
(a) that all will recover, (b) that exactly 3 will die, (c) that at least
3 will die.

14. A marksman is able, on the average, to hit a target 950 times out 
of 1,000. Find the probability that he will obtain (a) exactly 9 hits in 
10 shots, (b) exactly 10 hits in 10 shots, (c) either 9 or 10 hits in 10 shots, 
(d) at least 5 hits in 10 shots. Express symbolically. 

15. The registrar’s records show that 10 per cent of the students fail
a certain course. The present enrolment in the course is 25. What is the 
probability that 5 will fail? 

16. In the long run 3 vessels out of every 100 are sunk. If 10 vessels 
are out, what is the probability (a) that exactly 6 will arrive safely? 
(b) that at least 6 will arrive safely? Express symbolically. 

17. A batch of 1,000 electric bulbs was tested and found to be 5 per 
cent bad. If another batch of 100 lamps is manufactured under similar 
conditions, what is the probability that not more than 10 per cent will be 
defective? Give the result symbolically. 

18. The American Experience Mortality Table states that for an indi- 
vidual aged 25 the probability of survival another year is p = 0.992. 
What probabilities are expressed by the following: 

(a) ${}_{1000}C_{200}\,(.992)^{800}(.008)^{200}$?   (b) $\sum_{r=700}^{900} {}_{1000}C_r\,(.992)^{1000-r}(.008)^r$?

19. A, B , and C are three marksmen. A’ s record is 4 hits in 5 shots, 
B ' s record is 3 hits in 4 shots, and C’s record is 2 hits in 3 shots. They fire 
simultaneously. What is the probability that at least 2 shots hit? 

20. Of 7 dates picked at random, what is the probability that (a) exactly 
5 are Sundays, (b) at least 5 are Sundays, (c) the first 5 but no others are 
Sundays? 

21. A can hit a target 4 times in 5 shots; B , three times in four shots. 
They fire a volley. What is the probability (a) that at least two shots hit? 

(b) that at least one shot hits? 

22. A student takes a true-false test consisting of 10 questions and 
guesses at the answers. Assuming he is equally likely to answer incorrectly 
as correctly on each question, find the probability (a) that he will answer 
all the questions correctly, (b) that he will answer half of them correctly, 

(c) that he will answer 80 per cent or more of them correctly. 

23. In the long run a child under one year of age who is attacked by 
whooping cough has about a fifty-fifty chance of recovery. If 10 children 
under one year of age are attacked by this disease, 





(a) what is the expected number of deaths? 

(b) what is the probability that the expected number will die? 

(c) what is the probability that 8 or more recover? 

24. In how many throws with a single die will it be an even chance that 
“1” turns up at least once? 

25. If 2 dice are thrown, what is the probability of obtaining a total
of 7?

Hint: The number of ways of obtaining a total of 7 is the coefficient of
$x^7$ in $(x + x^2 + x^3 + x^4 + x^5 + x^6)^2$, or of $x^5$ in
$(1 + x + x^2 + x^3 + x^4 + x^5)^2$, or of $x^5$ in
$\left(\frac{1 - x^6}{1 - x}\right)^2 = (1 - x^6)^2 (1 - x)^{-2}$.

26. If three dice are thrown what is the probability of obtaining a
total of 10?

Hint: The number of ways of obtaining a total of 10 is the coefficient
of $x^{10}$ in $(x + x^2 + x^3 + x^4 + x^5 + x^6)^3$.

27. If three dice are thrown what is the probability of obtaining a 
total of 8? 

28. Three dice are thrown. Show that the probability of obtaining a 
total of 4 is equal to the probability of obtaining a total of 17. 



Chapter 12 

THE POINT BINOMIAL AND THE NORMAL CURVE 
98. INTRODUCTION 

In the preceding chapter considerable emphasis was placed upon 
what is essentially the 

Theorem. If p is the probability of the success of an event in a
single trial and q is the probability of its failure (p + q = 1), then the
successive terms of the binomial expansion

$$(q + p)^n = q^n + {}_nC_1\, q^{n-1} p + {}_nC_2\, q^{n-2} p^2 + \cdots + {}_nC_X\, q^{n-X} p^X + \cdots + p^n \qquad (1)$$

give the respective probabilities that in n trials the event will succeed
0, 1, 2, . . . , X, . . . , n times.

It should be especially noted that the general term

$$P_X = {}_nC_X\, q^{n-X} p^X$$

gives the probability that the event will succeed exactly X times in
n trials.

Figure 51 






Example 1. If a coin is tossed 10 times (or if 10 coins are tossed 1 time),
the successive terms of the expansion

$$(\tfrac{1}{2} + \tfrac{1}{2})^{10} = \tfrac{1}{1024}[1 + 10 + 45 + 120 + 210 + 252 + 210 + 120 + 45 + 10 + 1]$$

give the probabilities of 0 heads; 1 head, 9 tails; 2 heads, 8 tails; etc.

If the terms of this expansion $(\frac{1}{2} + \frac{1}{2})^{10}$ be plotted as ordinates at unit
distances along the horizontal axis, it will be noted that the points are
symmetrically distributed about the vertical through X = 5 (Fig. 51).
It will be shown later that the symmetry is due to the fact that $p = q = \frac{1}{2}$.

Example 2. Nine balls are drawn singly, with replacements, from a bag
containing white and black balls in the ratio of 2 to 1. If the drawing
of a white ball is counted a success, $p = \frac{2}{3}$, $q = \frac{1}{3}$, and the successive
terms of the expansion

$$(\tfrac{1}{3} + \tfrac{2}{3})^9 = \tfrac{1}{19683}[1 + 18 + 144 + 672 + 2016 + 4032 + 5376 + 4608 + 2304 + 512]$$

give the probabilities of drawing 0 white balls; 1 white, 8 black balls;
2 white, 7 black balls, etc.

If these probabilities be plotted as ordinates, as Figure 52 indicates,
it is noted that the points are not symmetrically distributed. That the
skewness here is due to the inequality of p and q will be shown in the
succeeding section.
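The bracketed numerators in Examples 1 and 2 can be regenerated from the general term of (1). The following Python sketch uses exact fractions; the function and variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

def point_binomial(n, p):
    """Terms of (q + p)^n in the order X = 0, 1, ..., n."""
    q = 1 - p
    return [comb(n, x) * q**(n - x) * p**x for x in range(n + 1)]

coins = point_binomial(10, Fraction(1, 2))   # Example 1
balls = point_binomial(9, Fraction(2, 3))    # Example 2

# Multiply by the common denominator to recover the bracketed integers.
coin_numerators = [int(t * 2**10) for t in coins]
ball_numerators = [int(t * 3**9) for t in balls]
print(coin_numerators)
print(ball_numerators)
```

Reversing the first list leaves it unchanged, while reversing the second does not, which is the symmetry and the skewness described in the two examples.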


Figure 52 



99. CHARACTERISTICS OF THE POINT BINOMIAL 

It has been observed in the preceding section that the binomial
distributions possess certain geometrical similarities to the observed
distributions studied in Chapters 3, 4, and 5; namely, they are
relatively low at the extremes and rise to a single mode near the
center. These similarities are so striking that we shall use the
binomial distribution as a theoretical or approximate distribution
with which to compare distributions of observed data. That is,
we shall use the point binomial as the first approximation to
distributions of observed data.

We shall need to characterize the binomial distribution, as we have 
other distributions, by computing measures of central tendency, 
dispersion, skewness, etcetera. Having computed these constants 
for the theoretical distribution, we shall apply the results to dis- 
tributions of observed data for purposes of comparison and general- 
ization. 

A. The Mode. Since the sum of the terms of $(q + p)^n$ equals
unity and the extreme terms are usually smaller than those near the
center, it would seem that for a determinate value of X, say $X = \alpha$,

$$P_X = {}_nC_X\, q^{n-X} p^X$$

will have a maximum.

In order for $P_X$ to be a maximum for $X = \alpha$, we must have

a. $P_{\alpha-1} \le P_\alpha$

b. $P_\alpha \ge P_{\alpha+1}$

that is, we must have

a. ${}_nC_{\alpha-1}\, q^{n-\alpha+1} p^{\alpha-1} \le {}_nC_\alpha\, q^{n-\alpha} p^\alpha$

b. ${}_nC_\alpha\, q^{n-\alpha} p^\alpha \ge {}_nC_{\alpha+1}\, q^{n-\alpha-1} p^{\alpha+1}$


Figure 53

Using the relation

$${}_nC_\alpha = \frac{n!}{\alpha!\,(n - \alpha)!},$$

the first inequality reduces to

$$\alpha \le np + p = (n + 1)p$$

and the second reduces to:

$$\alpha \ge np - q$$

That is, $\alpha$ satisfies the double inequality:

$$np - q \le \alpha \le np + p \qquad (2)$$


Diagram 13. The points $np - q$, $np$, and $np + p$ marked in order on a number line.

If $np + p$ is an integer, so is $np - q$, the next lower integer. In
this case two values of $\alpha$ satisfy (2) since $\alpha$ is necessarily integral.
They are $\alpha = np + p$ and $\alpha = np - q$. [See Exercise 6, p. 390.] Thus:

$$(\tfrac{1}{2} + \tfrac{1}{2})^3 = \tfrac{1}{8} + \tfrac{3}{8} + \tfrac{3}{8} + \tfrac{1}{8}$$

has two equal terms which are larger than any other terms, one at
$\alpha = \frac{3}{2} + \frac{1}{2} = 2$, and the other at $\alpha = \frac{3}{2} - \frac{1}{2} = 1$. Recalling that
$P_\alpha$ is the $(\alpha + 1)$th term of (1), the second and the third terms are
two equal terms which are larger than any other terms. Similarly,

$$(\tfrac{1}{3} + \tfrac{2}{3})^5 = \tfrac{1}{243}[1 + 10 + 40 + 80 + 80 + 32]$$

has two equal terms which are larger than any other terms, since
$np + p = 5(\frac{2}{3}) + \frac{2}{3} = 4$ is an integer. They are the fourth and the fifth
terms.

If $np + p$ or $(n + 1)p$ is fractional, so is $np - q$ since $np - q
= np - (1 - p) = (n + 1)p - 1$. By (2), $\alpha$ must be an integer lying
between them, and there is only one such integer.
Thus $(\frac{1}{3} + \frac{2}{3})^6$ has only one maximum term. For in this case
$np + p = 6(\frac{2}{3}) + \frac{2}{3} = 4\frac{2}{3}$, and $np - q = 3\frac{2}{3}$. Hence $\alpha = 4$, and the fifth term

$${}_6C_4 (\tfrac{1}{3})^2 (\tfrac{2}{3})^4 = \tfrac{240}{729}$$

is the maximum term. The entire expansion is:

$$(\tfrac{1}{3} + \tfrac{2}{3})^6 = \tfrac{1}{729}[1 + 12 + 60 + 160 + 240 + 192 + 64]$$



We may summarize these results into the following:

Theorem. If $np + p$ or $np - q$ is fractional, $P_X$ has one maximum
term for X equal to the greatest integer in $np + p$. If $np + p$ or $np - q$
is integral, $P_X$ has two equal terms which are larger than any other terms.
They occur when X equals $np + p$ and $np - q$.
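The theorem can be checked against a direct search for the largest term. The sketch below is in Python and assumes p is supplied as a `Fraction` with 0 < p < 1; both function names are illustrative.

```python
from fractions import Fraction
from math import comb, floor

def modes_by_theorem(n, p):
    """Value(s) of X giving the largest term, found from np + p."""
    upper = (n + 1) * p                  # np + p
    if upper.denominator == 1:           # integral: two equal largest terms
        return [int(upper) - 1, int(upper)]
    return [floor(upper)]                # fractional: greatest integer in np + p

def modes_by_search(n, p):
    """Value(s) of X giving the largest term, found by comparing every term."""
    terms = [comb(n, x) * (1 - p)**(n - x) * p**x for x in range(n + 1)]
    largest = max(terms)
    return [x for x, t in enumerate(terms) if t == largest]

for n, p in [(3, Fraction(1, 2)), (5, Fraction(2, 3)), (6, Fraction(2, 3))]:
    print(n, p, modes_by_theorem(n, p), modes_by_search(n, p))
```

The three cases are the expansions worked out above; X = 1 and 2, X = 3 and 4, and X = 4 correspond to the second and third, fourth and fifth, and fifth terms respectively.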

If n is large and np relatively large when compared with p and q,
np closely approximates $np + p$ and $np - q$. In this case we call
np the expected number of successes and $n - np = n(1 - p) = nq$
the expected number of failures of the event in n trials.

The probability of np successes is given by $P_{np} = {}_nC_{np}\, p^{np} q^{nq}$
which, upon applying Stirling’s formula (p. 379), reduces to

$$P_{np} = \frac{1}{\sqrt{2\pi npq}},$$

a small number when n is large. That is, obtaining exactly the
expected number of successes (or failures) is an improbable event.
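A quick numerical comparison shows how close the Stirling estimate is to the exact value of $P_{np}$. The Python sketch below uses n = 100 and p = 1/2, values chosen here purely for illustration so that np is an integer; the function names are likewise illustrative.

```python
from math import comb, pi, sqrt

def prob_expected_number(n, p):
    """Exact P_np from the general term; assumes np is an integer."""
    x = round(n * p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def stirling_estimate(n, p):
    """The approximation 1 / sqrt(2 * pi * n * p * q)."""
    return 1 / sqrt(2 * pi * n * p * (1 - p))

exact = prob_expected_number(100, 0.5)
approx = stirling_estimate(100, 0.5)
print(exact, approx)
```

Both values are below one tenth, so even the single most likely number of successes occurs in fewer than one set of 100 trials in ten.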

B. The Mean, the Dispersion, the Skewness. The computation
of M, $\sigma$, $\alpha_3$, and $\alpha_4$ is greatly facilitated by the preparation of Table 88
in which $f(X) = {}_nC_X\, q^{n-X} p^X$ indicates the ordinate corresponding to
the given abscissa, X.


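Table 88

| X (1) | $f(X) = {}_nC_X\, q^{n-X} p^X$ (2) | $X f(X)$ (3) | $X(X-1) f(X)$ (4) | $X(X-1)(X-2) f(X)$ (5) |
|---|---|---|---|---|
| 0 | $q^n$ | 0 | 0 | 0 |
| 1 | $n q^{n-1} p$ | $n q^{n-1} p$ | 0 | 0 |
| 2 | $\frac{n(n-1)}{1 \cdot 2}\, q^{n-2} p^2$ | $n(n-1)\, q^{n-2} p^2$ | $n(n-1)\, q^{n-2} p^2$ | 0 |
| 3 | $\frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}\, q^{n-3} p^3$ | $\frac{n(n-1)(n-2)}{1 \cdot 2}\, q^{n-3} p^3$ | $n(n-1)(n-2)\, q^{n-3} p^3$ | $n(n-1)(n-2)\, q^{n-3} p^3$ |
| . . . | . . . | . . . | . . . | . . . |
| $n - 1$ | $n q p^{n-1}$ | $n(n-1)\, q p^{n-1}$ | $n(n-1)(n-2)\, q p^{n-1}$ | $n(n-1)(n-2)(n-3)\, q p^{n-1}$ |
| $n$ | $p^n$ | $n p^n$ | $n(n-1)\, p^n$ | $n(n-1)(n-2)\, p^n$ |
| Total | $(q + p)^n$ | $np(q + p)^{n-1} = np$ | $n(n-1)p^2 (q + p)^{n-2} = n(n-1)p^2$ | $n(n-1)(n-2)p^3 (q + p)^{n-3} = n(n-1)(n-2)p^3$ |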

The total of column (2) of the table, $\Sigma f(X)$, is obviously unity since:

$$\Sigma f(X) = q^n + n q^{n-1} p + \frac{n(n-1)}{1 \cdot 2}\, q^{n-2} p^2 + \cdots + p^n = (q + p)^n = 1$$

The total of column (3), $\Sigma X f(X)$, is easily recognized if one takes the
common factor, np, out of every term. Thus:

$$\Sigma X f(X) = np\left[q^{n-1} + (n-1)\, q^{n-2} p + \frac{(n-1)(n-2)}{1 \cdot 2}\, q^{n-3} p^2 + \cdots + p^{n-1}\right] = np(q + p)^{n-1} = np$$

Likewise columns (4) and (5) may be reduced to:

$$\Sigma X(X-1) f(X) = n(n-1)p^2 (q + p)^{n-2} = n(n-1)p^2$$

$$\Sigma X(X-1)(X-2) f(X) = n(n-1)(n-2)p^3 (q + p)^{n-3} = n(n-1)(n-2)p^3$$


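We therefore have:

$$M = \frac{\Sigma X f(X)}{\Sigma f(X)} = \frac{np}{1} = np$$

Since $\Sigma X(X-1) f(X) = \Sigma X^2 f(X) - \Sigma X f(X) = n(n-1)p^2$
(from Table 88), we have:

$$\Sigma X^2 f(X) = n(n-1)p^2 + \Sigma X f(X) = n(n-1)p^2 + np$$

$$= n^2p^2 + np(1 - p) = n^2p^2 + npq = M^2 + npq$$

Hence:

$$\sigma^2 = \frac{\Sigma X^2 f(X)}{\Sigma f(X)} - M^2 = \frac{M^2 + npq}{1} - M^2 = npq$$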

Similarly:

$$\Sigma X(X-1)(X-2) f(X) = \Sigma X^3 f(X) - 3\Sigma X^2 f(X) + 2\Sigma X f(X)$$

$$= \Sigma X^3 f(X) - 3(n^2p^2 + npq) + 2np$$

$$= \Sigma X^3 f(X) - 3n^2p^2 - 3npq + 2np$$

$$= n(n-1)(n-2)p^3 \quad \text{(from Table 88)}$$

Therefore:

$$\Sigma X^3 f(X) = n^3p^3 + 3n^2p^2(1 - p) + 3npq + 2np^3 - 2np$$

$$= n^3p^3 + 3n^2p^2 q + 3npq + 2np^3 - 2np$$

Using the formula for $\nu_3$ given on page 162 for the case in which
w = 1, h = 0, $N = \Sigma f(X) = 1$, we have:

$$\nu_3 = \Sigma X^3 f(X) - 3\Sigma X^2 f(X) \cdot M + 2M^3$$

Substituting the values given above:

$$\nu_3 = 3npq + 2np^3 - 2np = np(3q + 2p^2 - 2)$$

$$= np(3 - 3p + 2p^2 - 2) = np(1 - p)(1 - 2p)$$

and finally

$$\nu_3 \text{ or } \mu_3 = npq(1 - 2p) = npq(q - p)$$

Hence:

$$\alpha_3 = \frac{\mu_3}{\sigma^3} = \frac{npq(q - p)}{(npq)^{3/2}} = \frac{q - p}{\sigma}$$

Collecting these results we have

$$M = np, \qquad \sigma = \sqrt{npq}, \qquad \alpha_3 = \frac{q - p}{\sigma} \qquad (3)$$


where the positive direction is that of increasing X. 

The equation M = np shows that for the point binomial, $(q + p)^n$,
the mean value is equal to the expected value. The value of $\alpha_3$
shows that the skewness is positive when p is less than q, is negative
when p is greater than q, and is zero when p equals q.
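The results collected in (3) can be compared with constants computed term by term from the point binomial itself. The Python sketch below does this for the distribution of Example 2 (n = 9, p = 2/3); the function name is illustrative, and floating-point arithmetic is used, so agreement is to within rounding error only.

```python
from math import comb, sqrt

def binomial_constants(n, p):
    """M, sigma and alpha_3 computed term by term from the point binomial."""
    q = 1 - p
    f = [comb(n, x) * q**(n - x) * p**x for x in range(n + 1)]
    mean = sum(x * fx for x, fx in enumerate(f))
    mu2 = sum((x - mean)**2 * fx for x, fx in enumerate(f))
    mu3 = sum((x - mean)**3 * fx for x, fx in enumerate(f))
    sigma = sqrt(mu2)
    return mean, sigma, mu3 / sigma**3

n, p = 9, 2 / 3
mean, sigma, alpha3 = binomial_constants(n, p)
q = 1 - p
formula = (n * p, sqrt(n * p * q), (q - p) / sqrt(n * p * q))   # formulas (3)
print((mean, sigma, alpha3))
print(formula)
```

Since p exceeds q here, the skewness comes out negative, in agreement with the rule just stated.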

In the next list of exercises we ask the student to show that when
n becomes infinite in the point binomial $(q + p)^n$, the skewness $\alpha_3$
approaches zero, and the kurtosis $(\alpha_4 - 3)$ also approaches zero.
We have stated in Chapter 5 that, for a normal distribution, $\alpha_3$
equals zero and $\alpha_4$ equals 3. Thus we see that as n increases the
moments (of order less than 5) of the point binomial approach the
same moments of the normal distribution.


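Values of $\alpha_3$ and $\alpha_4$ for $(.98 + .02)^n$

| n | $\alpha_3$ | $\alpha_4$ |
|---|---|---|
| 100 | .68 | 3.45 |
| 200 | .48 | 3.23 |
| 300 | .40 | 3.15 |
| 400 | .34 | 3.11 |
| 500 | .31 | 3.09 |
| 600 | .28 | 3.075 |
| 700 | .26 | 3.06 |
| 800 | .24 | 3.06 |
| 900 | .23 | 3.05 |
| 1000 | .21 | 3.045 |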


The rapidity with which $\alpha_3$ approaches
zero and $\alpha_4$ approaches 3 as n increases,
even for the case where p is extremely
small, is shown by the accompanying
table.
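The tabulated values can be recomputed by summing directly over the terms of $(.98 + .02)^n$, without using any closed formula for $\alpha_4$. The Python sketch below is one way to do so; it evaluates each term through logarithms with the standard library function `math.lgamma` to avoid very large intermediate numbers, and the function names are illustrative.

```python
from math import exp, lgamma, log

def log_term(n, x, p):
    """Natural log of nCx * q^(n-x) * p^x, computed with log-gamma."""
    return (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
            + x * log(p) + (n - x) * log(1 - p))

def alpha3_alpha4(n, p):
    """alpha_3 and alpha_4 by direct summation over the point binomial."""
    f = [exp(log_term(n, x, p)) for x in range(n + 1)]
    mean = sum(x * fx for x, fx in enumerate(f))
    mu = {k: sum((x - mean)**k * fx for x, fx in enumerate(f)) for k in (2, 3, 4)}
    return mu[3] / mu[2]**1.5, mu[4] / mu[2]**2

for n in range(100, 1100, 100):
    a3, a4 = alpha3_alpha4(n, 0.02)
    print(n, round(a3, 3), round(a4, 3))
```

The printed figures should agree with the table to within a unit or so in the last place shown.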





EXERCISES 

1. Plot the histograms and the frequency polygons for the following
binomials. Find for each binomial the $M_o$, M, $\sigma$, and $\alpha_3$.

a. (* + f)‘ b. (f + *)‘ c. (* + *)« d. (f + f)< 

2. By extending Table 88 show that:

$$\Sigma X(X-1)(X-2)(X-3) f(X) = n(n-1)(n-2)(n-3)p^4$$

3. Using the value of $\mu_4$ given on page 162, show that for the point
binomial:

$$\mu_4 = \nu_4 = npq[1 + 3pq(n - 2)]$$

4. Show that $\alpha_4$ for the point binomial is given by:



Hint. (1) Let $np - q = k$, then $np + p = k + 1$, from which obtain
$(n - k)/(k + 1) = q/p$. (2) Show that $P_{np+p} = P_{k+1} = \frac{n - k}{k + 1} \cdot \frac{p}{q} \cdot P_k$.
(3) Combine the results of (1) and (2).

7. Show that as n becomes infinite, $\alpha_3$ approaches zero and $\alpha_4$ approaches 3.

8. Verify the values of the table for $(.98 + .02)^n$.



The following exercises are for students of the calculus.

9. Show that

$$\left[\frac{d^i}{dt^i}(q + pt)^n\right]_{t=1}, \qquad (i = 0, 1, 2, 3)$$

give the totals of columns 2, 3, 4, and 5 of Table 88.

10. Show that the moments of $(q + p)^n$ given by

$$\mu_i = \left[\frac{d^i}{dt^i}\left(p e^{qt} + q e^{-pt}\right)^n\right]_{t=0}, \qquad (i = 2, 3, 4)$$

are the same as we have given in the text. This relationship was given
by Karl Pearson in Biometrika, Vol. XII, p. 270.

11. The moments of $(q + p)^n$ can be obtained from

$$\mu_{t+1} = pq\left[nt\,\mu_{t-1} + \frac{d\mu_t}{dp}\right]$$

recalling that $\mu_0 = 1$ and $\mu_1 = 0$. Use this relation¹ to establish the values
given in the text.


100. THE POINT BINOMIAL APPLIED TO FREQUENCY 
DISTRIBUTIONS 

It should be emphasized that the terms of (1) represent probabilities 
and that their sum is unity. By Section 95 (p. 374), if the terms of 
(1) are multiplied by some suitable number, the several terms will 
then represent frequencies. Thus, if 10 coins are thrown 1,024 times, 
the terms of the expansion 

$$1{,}024(\tfrac{1}{2} + \tfrac{1}{2})^{10} = 1 + 10 + 45 + \cdots + 252 + \cdots + 10 + 1$$

represent the expected number of times that we should obtain 0, 1, 2,
. . . , 5, . . . , 9, 10 heads, that is,

$$\text{expected frequency of } X = (1024)\, {}_{10}C_X (\tfrac{1}{2})^{10-X} (\tfrac{1}{2})^X$$

An experiment in which 10 coins were thrown 1,024 times was 
performed, and the actual results together with the theoretical or 
expected results are shown in Table 89. 

¹ See article by A. T. Craig, Bulletin of the American Mathematical Society,
Vol. 40, p. 262.



Table 89. Actual and Expected Results
in Tossing 10 Coins 1,024 Times

| Number of Heads Up, X | Actual Frequency, f(X) | Expected Frequency, f'(X) |
|---|---|---|
| 0 | 2 | 1 |
| 1 | 10 | 10 |
| 2 | 40 | 45 |
| 3 | 116 | 120 |
| 4 | 205 | 210 |
| 5 | 257 | 252 |
| 6 | 216 | 210 |
| 7 | 126 | 120 |
| 8 | 42 | 45 |
| 9 | 8 | 10 |
| 10 | 2 | 1 |
| Total | 1,024 | 1,024 |


Figure 54 



When the data of Table 89 are plotted as in Figure 54, and
the frequency polygons are drawn, the differences between the
observed and the expected frequencies are seen to be slight. These
differences may be the result of many causes, such as the lack of
homogeneity of the coins, the faulty methods of tossing them, and
what are usually known as variations due to chance.

The statistical constants for the observed and the theoretical 
distributions of Table 89 are given in Table 90. The constants for 
the theoretical distribution were computed by (3) and Exercise 4 
on page 390, whereas those for the distribution of observed values 
were computed by the methods of Section 44 (p. 164). 


Table 90

| | Distribution of Theoretical Values | Distribution of Observed Values |
|---|---|---|
| M | 5.0000 | 5.0283 |
| $\sigma$ | 1.581 | 1.567 |
| $\alpha_3$ | 0.0000 | -0.0499 |
| $\alpha_4$ | 2.8000 | 2.9246 |
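The constants of Table 90 can be recomputed from the two frequency columns of Table 89. The Python sketch below uses plain moments about the mean with no grouping corrections, which is an assumption about the method; the observed values of $\alpha_3$ and $\alpha_4$ it prints may therefore differ slightly from those tabulated. The function name is illustrative.

```python
from math import comb, sqrt

observed = [2, 10, 40, 116, 205, 257, 216, 126, 42, 8, 2]   # Table 89, actual
expected = [comb(10, x) for x in range(11)]                  # 1,024 * (1/2 + 1/2)^10

def constants(freqs):
    """M, sigma, alpha_3, alpha_4 of a frequency table with X = 0, 1, 2, ..."""
    total = sum(freqs)
    mean = sum(x * f for x, f in enumerate(freqs)) / total
    mu = {k: sum((x - mean)**k * f for x, f in enumerate(freqs)) / total
          for k in (2, 3, 4)}
    return mean, sqrt(mu[2]), mu[3] / mu[2]**1.5, mu[4] / mu[2]**2

obs_constants = constants(observed)
exp_constants = constants(expected)
print("theoretical:", [round(c, 4) for c in exp_constants])
print("observed:   ", [round(c, 4) for c in obs_constants])
```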


In a similar manner any distribution of observed values can be
more or less approximately reproduced by multiplying the terms of
the expansion of $(q + p)^n$ by the total frequency N. If the distribution
is nearly symmetrical, we take $p = q = \frac{1}{2}$ and n such a number
that the (n + 1) terms of the expansion when multiplied by N will
give (n + 1) theoretical frequencies.

Thus, let us consider the following distribution of the heights of
750 college men. Since distributions of heights of men are known
to be closely symmetrical, we choose $p = q = \frac{1}{2}$. Also, since there
are 14 classes of heights ranging from 61 inches to 74 inches inclusive,
we choose n = 13. Hence the terms of the expansion $750(\frac{1}{2} + \frac{1}{2})^{13}$
give 14 theoretical frequencies. The following table exhibits the
frequency distributions of theoretical and observed values. The
theoretical frequency, for a given X, is $750\, {}_{13}C_X (\frac{1}{2})^{13-X} (\frac{1}{2})^X$.

Exercise. Compute values of M, $\sigma$, $\alpha_3$, and $\alpha_4$ for the two distributions
of Table 91 and thus make a comparison of their moments.



Table 91. Observed and Binomial Frequencies
of the Heights of 750 College Men

| Height | X | Observed f(X) | Binomial f(X) |
|---|---|---|---|
| 61 | 0 | 2 | 0 |
| 62 | 1 | 4 | 1 |
| 63 | 2 | 10 | 7 |
| 64 | 3 | 32 | 26 |
| 65 | 4 | 63 | 66* |
| 66 | 5 | 103 | 118 |
| 67 | 6 | 146 | 157 |
| 68 | 7 | 143 | 157 |
| 69 | 8 | 111 | 118 |
| 70 | 9 | 75 | 66* |
| 71 | 10 | 35 | 26 |
| 72 | 11 | 12 | 7 |
| 73 | 12 | 3 | 1 |
| 74 | 13 | 1 | 0 |
| Total | | 750 | 750 |

* This value was 65.5.


Comparing the observed with the theoretical frequency it is of
course noted that, for a given value of X, the observed frequency
differs from the theoretical frequency. Even the most scrupulous
among us are not surprised at these differences. However, the student 
may properly inquire as to just how large such differences may be. 
This is one of the fundamental questions to which we shall give 
attention in Chapter 13 when we consider the problem of sampling. 

For a given N and n, the theoretical distribution N(q + p) n 
obviously depends upon the value of p or q. The value of p may be 
determined a priori as in dice-throwing or coin-tossing experiments, 
or it may be determined empirically from experiment or observation 
as in the probabilities of life and death. When p is determined em- 
pirically, it is influenced by sampling errors. Other samples of the 
same size chosen from the same universe will not yield the same 
values of p, and consequently the goodness of the theoretical distribu- 
tion N(q + p) n for graduation purposes will depend upon the ac- 
curacy of p. 





The binomial distribution (q + p) n was the first theoretical dis- 
tribution to be established. It was first discussed in Ars Conjectandi 
(published posthumously in 1713) by James Bernoulli and thus any 
discrete distribution with frequencies proportional to the terms of 
the expansion is frequently called a Bernoulli Distribution. In fact, 
what we have called the Repeated Trials Theorem is frequently called 
The Bernoulli Theorem . 


EXERCISES 

1. Table A below gives the I.Q.’s of 905 school children. Table B
gives the weights of 1000 school children. Graduate Table A by the
expansion $905(\frac{1}{2} + \frac{1}{2})^8$ and Table B by $1000(\frac{1}{2} + \frac{1}{2})^9$.

Table A

| X | f(X) |
|---|---|
| 60.5 | 3 |
| 70.5 | 21 |
| 80.5 | 78 |
| 90.5 | 182 |
| 100.5 | 305 |
| 110.5 | 209 |
| 120.5 | 81 |
| 130.5 | 21 |
| 140.5 | 5 |
| Total | 905 |

M = 100.95
$\sigma$ = 13.0

Table B

| X | f(X) |
|---|---|
| 29.5 | 1 |
| 33.5 | 14 |
| 37.5 | 56 |
| 41.5 | 172 |
| 45.5 | 245 |
| 49.5 | 263 |
| 53.5 | 156 |
| 57.5 | 67 |
| 61.5 | 23 |
| 65.5 | 3 |
| Total | 1000 |

M = 47.71 pounds
$\sigma$ = 5.88 pounds


101. THE NORMAL CURVE: INTRODUCTORY REMARKS 

In preceding chapters we have described frequency distributions 
by three methods: the graphical method, the method of moments, 
and the point binomial. The graphical method is a mere pictorial 
representation of the tabulated data and is inadequate statistically 
because it is only a picture. The method of moments is a refined 
method which is adequate for many purposes, especially for purposes 
of comparison, when M, $\sigma$, $\alpha_3$, and $\alpha_4$ are computed. The binomial
distribution is still a step forward. It gives us an equation for writing
down the theoretical frequency for a given integral value of X,
and the estimated sum of such frequencies between certain specified
limits. Thus, theoretically at least, the point binomial provides all
the advantages that accrue from an equation.

Practically, the point binomial is unsatisfactory for two important 
reasons. First, it is a discontinuous function, being strictly defined 
only for integral values of X. Second, when n is large, its use in 
answering many questions entails so much labor as to render it 
unfit for practical usage. We seek, therefore, a continuous function 
having approximately the same ordinates as the binomial series and 
which is so well tabulated that important questions in probability 
can be answered by its use without the tedium of undue labor. The 
simplest continuous function that meets our needs is the normal or 
Gaussian function, whose general equation is: 

$$y = Ce^{-h^2x^2}$$

Here e is the base of the natural or Napierian system of logarithms
whose value is 2.71828. . . . The constant C determines the maximum
height of the curve and the constant h its spread.
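To see how closely such a curve can follow the binomial ordinates, the sketch below compares the terms of $(\frac{1}{2} + \frac{1}{2})^{10}$ with ordinates of $y = Ce^{-h^2x^2}$, where x is measured from the mean. It is written in Python. The choices $C = 1/(\sigma\sqrt{2\pi})$ and $h = 1/(\sigma\sqrt{2})$ are the usual identification of the constants with the standard deviation; they are an assumption brought in for this illustration and are not stated in this section.

```python
from math import comb, exp, pi, sqrt

def normal_ordinate(x, C, h):
    """y = C * exp(-h^2 x^2), with x measured from the centre of the curve."""
    return C * exp(-(h * x)**2)

n, p = 10, 0.5
mean, sigma = n * p, sqrt(n * p * (1 - p))

# Assumed identification of the constants (standard, but not from this section).
C = 1 / (sigma * sqrt(2 * pi))
h = 1 / (sigma * sqrt(2))

for X in range(n + 1):
    binomial = comb(n, X) * p**X * (1 - p)**(n - X)
    print(X, round(binomial, 4), round(normal_ordinate(X - mean, C, h), 4))
```

A larger h gives a narrower curve, and C is the ordinate at x = 0, matching the roles assigned to the two constants above.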

As was stated in Section 89, the normal or Gaussian curve was 
first established by De Moivre. A proof was also given by Laplace 
at a later date and hence the curve is sometimes called the Laplacean 
curve. Gauss approved the law, used it, and gave an original proof 
of it. Thus, the normal law began its early life with a rare hereditary 
background. No wonder the lesser lights of the first half of the 
nineteenth century claimed for it a value that was undeserved, 
considered it to be “the ideal curve,” and demanded an explanation if a 
distribution did not obey it. 

The writers in the latter half of the nineteenth century seem to 
have been more careful that their enthusiasms did not outrun the 
facts, for as data from many fields accumulated it became general 
knowledge that the normal curve is but one of a number of types 
of curves which are used to describe frequency distributions. So 
we must not assume that a non-normal distribution is “abnormal” 
in the usual sense of the word. 

The normal curve, however, is by far the most important type; 
further, its importance seems to have increased within recent years, 
and the history of the theory of statistics may date from its discovery 
by De Moivre in 1733. There are good reasons why this is so. 

First, it is a continuous function. 

Second, the normal curve lends itself well to mathematical treatment. 
That is, it possesses properties that are mathematically elegant, com- 
paratively simple to derive, and expressible in simple forms. 

Third, a large number of distributions, mound-shaped in appearance, are 
approximately of the normal form and may be subjected to normal curve 
analysis as a first approximation. 

Fourth, many sampling distributions, such as distributions of means, 
distributions of standard deviations, and others are of the normal form 
exactly or to a satisfactory degree of approximation. Thus, the formulas 
for determining the reliability of a statistical function “lean heavily upon 
this law.” 

Fifth, of two well-known systems of generalized frequency curves, one 
of them, that developed by Gram, Thiele, Charlier (known as the Scandi- 
navian school), is based upon the normal curve as a generating function. 

A development of the theory of generalized frequency functions, 
though an important and attractive study, is so severe in the mathe- 
matical background required to comprehend it that its inclusion 
in our elementary study would seem inappropriate. However, a 
derivation of the normal curve and a study of its properties are so 
essential to the study of elementary statistical analysis that their 
inclusion in our text seems mandatory. 

102. DERIVATION OF THE EQUATION 
TO THE NORMAL CURVE 

Figure 51 (p. 383) shows the frequency polygon for the point 
binomial $(\tfrac12 + \tfrac12)^{10}$. The eleven points are symmetrically 
distributed about the vertical line through X = M = np = 5. In like manner if 
$(\tfrac12 + \tfrac12)^{n}$ be plotted for any n, the points will be symmetrically 
distributed about the vertical line through X = M = np = n/2 since 
p = q and α₃ = 0. Now if n be allowed to increase indefinitely the 
polygon of (n + 1) vertices and (n + 2) sides will approach a smooth 
curve,¹ the normal curve, symmetrical to the vertical line through 
X = M. In other words, the normal curve is the limit of the point 
binomial $(\tfrac12 + \tfrac12)^{n}$ as n becomes infinite. 

¹ As n increases, it becomes necessary to reduce the X-scale to keep the 
diagram within reasonable dimensions. We are interested in confining the range 
to an interval of three or four standard deviations from the mean. Consequently, 
we assume that n increases and Δx decreases in such a way that n(Δx)² always 
equals a constant 2σ². 
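The limiting statement above can be checked numerically. The following is a minimal sketch, not part of the original text: it compares the terms of the point binomial with the ordinates of a normal probability curve having the same mean np and standard deviation √(npq). The function names are illustrative choices.

```python
import math

def binomial_ordinates(n):
    """Terms of (1/2 + 1/2)**n for X = 0, 1, ..., n."""
    return [math.comb(n, k) / 2**n for k in range(n + 1)]

def normal_ordinate(x_value, mean, sigma):
    """Ordinate of the normal probability curve with the given mean and sigma."""
    t = (x_value - mean) / sigma
    return math.exp(-t * t / 2) / (sigma * math.sqrt(2 * math.pi))

def max_relative_gap(n):
    """Largest gap between binomial and normal ordinates, relative to the peak term."""
    mean, sigma = n / 2, math.sqrt(n) / 2      # np and sqrt(npq) with p = q = 1/2
    terms = binomial_ordinates(n)
    gap = max(abs(p - normal_ordinate(k, mean, sigma)) for k, p in enumerate(terms))
    return gap / max(terms)

for n in (10, 100, 1000):
    print(n, max_relative_gap(n))
```

The printed gap should shrink as n grows, which is the sense in which the polygon approaches the smooth curve.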

The proof of the statement above is facilitated by assuming that 
n is even and by employing the 

Lemma. If the several terms of the expansion $(\tfrac12 + \tfrac12)^{2n}$ be plotted 
as ordinates at intervals of ΔX along the X-axis, ${}_{2n}C_0/2^{2n}$ being taken 
at the origin, so that the abscissas of ${}_{2n}C_1/2^{2n}$, ${}_{2n}C_2/2^{2n}$, ..., 
${}_{2n}C_n/2^{2n}$, ..., ${}_{2n}C_{2n}/2^{2n}$ are ΔX, 2ΔX, ..., nΔX, ..., 2nΔX, then: 

$$ M = n\,\Delta X \quad\text{and}\quad \sigma^2 = \frac{n\,(\Delta X)^2}{2} $$

The proof of this lemma is identical in method to that used in 
Section 99B (p. 387), hence its derivation will be left as an exercise 
for the student. 

Let us consider then the expansion: 

$$ \left(\tfrac12 + \tfrac12\right)^{2n} = \frac{1}{2^{2n}}\left[\,1 + {}_{2n}C_1 + {}_{2n}C_2 + \cdots + {}_{2n}C_n + \cdots + {}_{2n}C_{n+r} + \cdots + 1\,\right] $$

Let us plot the terms of this expansion as ordinates at equal 
intervals ΔX along the X-axis beginning with the first term at the 
origin. The maximum term is evidently ${}_{2n}C_n/2^{2n}$, which we erect at the 
mean, O′ (Figure 55). We plot the other terms with respect to this new origin. 
Evidently Δx = ΔX. Let P(x, y) and Q(x + Δx, y + Δy) be the 
successive vertices of the polygon which are determined by the rth 
and the (r + 1)th terms from the middle term of the above expansion. 
Then the ordinates of the points are: 


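$$ y = \frac{{}_{2n}C_{n+r}}{2^{2n}} \quad\text{and}\quad y + \Delta y = \frac{{}_{2n}C_{n+r+1}}{2^{2n}} $$

Since 

$$ {}_{2n}C_{n+r+1} = {}_{2n}C_{n+r}\left(\frac{n-r}{n+r+1}\right) \quad\text{(see Exercise 13 on p. 374)} $$

we have: 

$$ \frac{\Delta y}{\Delta x} = \frac{y}{\Delta x}\left(\frac{-2r-1}{n+r+1}\right) $$

The abscissa of P is x = rΔx; hence: 

$$ r = \frac{x}{\Delta x} $$

Consequently: 

$$ \frac{\Delta y}{\Delta x} = -\frac{y}{\Delta x}\left(\frac{2x+\Delta x}{n\,\Delta x + x + \Delta x}\right) = -y\left(\frac{2x+\Delta x}{n\,\Delta x^2 + x\,\Delta x + \Delta x^2}\right) $$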


From the lemma above we have: 

$$ n\,\Delta x^2 = 2\sigma^2, \text{ a constant} $$

Therefore: 

$$ \frac{\Delta y}{\Delta x} = -y\left(\frac{2x+\Delta x}{2\sigma^2 + x\,\Delta x + \Delta x^2}\right) $$


Now let n become infinite and Δx approach zero.¹ We then have 

$$ \frac{dy}{dx} = -\frac{xy}{\sigma^2} $$

which, upon integration, reduces to: 

$$ y = C e^{-\frac{x^2}{2\sigma^2}} = C e^{-h^2 x^2} $$

where h (= 1/(σ√2)) is called the index of precision. 
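For the reader who wishes to see the integration written out, the step may be sketched as follows. This is the usual separation of variables and is not quoted from the text; ln C is introduced here as the constant of integration.

$$ \frac{dy}{y} = -\frac{x\,dx}{\sigma^2} \quad\Longrightarrow\quad \ln y = -\frac{x^2}{2\sigma^2} + \ln C \quad\Longrightarrow\quad y = C e^{-\frac{x^2}{2\sigma^2}} $$

Comparing exponents with $C e^{-h^2 x^2}$ gives $h^2 = 1/(2\sigma^2)$, that is, $h = 1/(\sigma\sqrt{2})$.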


1 See footnote page 397. 





In order to make this curve statistically useful, we shall assume 
that the area under the curve is equal to the area of the histogram, 
Nw, where w is the class width and N is the total frequency. That is, 
we assume 

$$ \int_{-\infty}^{\infty} C e^{-\frac{x^2}{2\sigma^2}}\,dx = Nw $$

from which it follows, using the well-known relation [Ex. 1, p. 404] 

$$ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} $$

that 

$$ C = \frac{Nw}{\sigma\sqrt{2\pi}} $$

Substituting this value, we have the equation to the normal frequency 
curve: 

$$ y = \frac{Nw}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \tag{4} $$
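The value of C can be obtained from the well-known relation by a change of variable. The following is a sketch of that step; the substitution variable u is introduced here for illustration and does not appear in the text.

$$ u = \frac{x}{\sigma\sqrt{2}}, \qquad dx = \sigma\sqrt{2}\,du, \qquad C\int_{-\infty}^{\infty} e^{-\frac{x^2}{2\sigma^2}}\,dx = C\,\sigma\sqrt{2}\int_{-\infty}^{\infty} e^{-u^2}\,du = C\,\sigma\sqrt{2\pi} = Nw $$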


It must be emphasized that in equation (4) x is the deviation from 
the mean of the variate whose frequency is y, or f(x). By replacing x 
by its equal X − M we may express the equation in the form: 

$$ Y = \frac{Nw}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-M)^2}{2\sigma^2}} \tag{5} $$

If in (4) we make the area under the curve equal to unity, the 
equation reduces to the normal probability curve: 

$$ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \tag{6} $$


which gives the probability of any deviation x. 

It is customary, due to the simplicity of application, to express the 
deviations in standard units, that is, to make σ the unit for measuring 
deviations. If in (4) and (5) we place 

$$ t = \frac{x}{\sigma} = \frac{X - M}{\sigma} $$

we obtain: 

$$ y = \frac{Nw}{\sigma\sqrt{2\pi}}\, e^{-\frac{t^2}{2}} \tag{7} $$

Finally we write: 

$$ y = \frac{Nw}{\sigma}\,\phi(t) \tag{8} $$

where 

$$ \phi(t) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}} \tag{9} $$

(Read: phi of tee.) 


103. SOME PROPERTIES OF φ(t) ¹ 

Values of φ(t) and of the areas bounded by φ(t), the t-axis, and 
certain ordinates are tabulated in Appendix B. The graph of φ(t) 
is shown in the accompanying figure which is drawn from the values 
in Table 92. 

Figure 56. Graph of φ(t); the base line is also marked in X units at 
X = M − σ, X = M, X = M + σ, etc. 

Since −t yields the same value to φ(t) as +t, that is, since 
φ(−t) = φ(t), the curve is symmetrical with respect to the vertical 
line through t = 0. It is therefore not necessary to tabulate negative 
values of t. Since the total area under φ(t) is 1.0000, the area on 
either side of the vertical line of symmetry is 0.5000. Therefore the 
median coincides with the mean. The largest value of φ(t) is that for 
which t = 0, therefore the mode coincides with the mean. There is no 
finite value of t for which φ(t) = 0, but φ(t) is relatively small for 
values of t outside of t = ± 3. It is because of the last-mentioned 
fact that the normal curve can be used to represent finite 
distributions. As a matter of fact the combined area of the two tails 
beyond t = −3 and t = +3 is only 0.0026, and the combined area of the 
two tails beyond t = −4 and t = +4 is 0.000064. The curve crosses its 
tangent at t = ± 1, φ(t) = .2420. These are called inflection points. 

¹ Several of these properties require the calculus for proofs. 
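The ordinates and tail areas quoted above can be reproduced without tables. The sketch below is an illustration only: it evaluates φ(t) from equation (9) and obtains the two-tail area from the complementary error function in Python's standard library. Because the book works from a four-place table, the last digit may differ slightly from the computed value (for example, the two tails beyond ± 3 come out nearer 0.0027).

```python
import math

def phi(t):
    """Ordinate of the normal probability curve in standard units, equation (9)."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def two_tail_area(t):
    """Combined area under phi beyond -t and +t."""
    return math.erfc(t / math.sqrt(2))

# Ordinates at the values of t used in Table 92.
for t in (0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(t, round(phi(t), 4))

print(two_tail_area(3), two_tail_area(4))
```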

The areas of certain portions of φ(t) are so important in statistical 
analysis that we must not fail to emphasize them. We shall use the 
symbol $A_\phi\big]_{t=a}^{t=b}$ or, more briefly, $A_\phi\big]_a^b$ to mean “the area under 
φ(t) from t = a to t = b.” Thus, we have from the table $A_\phi\big]_0^1$ = .3413, 
$A_\phi\big]_0^2$ = .4773, $A_\phi\big]_0^3$ = .4987. By the simple addition and 
subtraction of areas we also have 

$A_\phi\big]_1^2$ = .1360, $A_\phi\big]_2^3$ = .0214, $A_\phi\big]_{-1}^{0}$ = .3413, $A_\phi\big]_{-1}^{2}$ = .8186. 

The statement $A_\phi\big]_0^1$ = .3413 means that between the ordinates 
erected at t = 0 and t = 1 is included 34.13 per cent of the total 
area under the curve. More broadly interpreted, it means that for 
a normal frequency distribution about one-third of the total frequency 
is found between the mean and x = σ (see p. 135). In the language 
of probability, the statement means that the chance is approximately 
1/3 that a measure selected at random from a given distribution of 
variates normally distributed will fall within the interval between 
t = 0 and t = 1, or between x = 0 and x = σ, or between X = M 
and X = M + σ. 

It will be left as exercises for the student to interpret the other 
areas illustrated above. 
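As a check on the table look-ups and on the addition and subtraction of areas, the area between any two values of t can be computed directly. This is a minimal sketch assuming the standard-library error function; the helper names `cdf` and `area` are illustrative and are not notation from the text.

```python
import math

def cdf(t):
    """Area under phi from minus infinity up to t."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def area(a, b):
    """Area under phi from t = a to t = b."""
    return cdf(b) - cdf(a)

for a, b in [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (-1, 2)]:
    print(a, b, round(area(a, b), 4))
```

Values taken from a four-place table may differ from these in the last digit.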


Table 92 

    t       φ(t) 
    0      .3989 
    0.5    .3521 
    1.0    .2420 
    1.5    .1295 
    2.0    .0540 
    2.5    .0175 
    3.0    .0044 





The value of t that satisfies one of the equations 

$$ A_\phi\big]_0^{t} = .2500 \qquad A_\phi\big]_{-t}^{+t} = .5000 \tag{10} $$

defines one of the most important concepts found in statistics. The 
value of t defined by either of the given equations (10) is called the 
probable error, E, of a single observation. The probable error, E, 
is that distance which, when laid off on either side of the mean of a 
normal curve, defines an interval such that, if ordinates are erected 
at its end points, the area included by the ordinates, the curve, and 
the base line is one-half the total area under the curve. Stated some- 
what differently, the probable error of a distribution of variates 
normally distributed may be defined as that deviation on either side 
of the mean within which exactly half the variates lie. Since half the 
total frequency lies within the interval M — E to M + E, if any 
one variate be selected at random from the N given variates there 
is an even chance that the selected variate falls within the given 
interval M — E to M + E or without it. 

For an approximate solution of equation (10) let us interpolate 
between t = .67 and t = .68. The solution is: 

$$ \frac{z}{.01} = \frac{.0014}{.0032}, \qquad z = .0044 $$

and t = .67 + z = .6744. More extended tables lead to the more 
accurate value 

$$ t = \frac{x}{\sigma} = .6745 \text{ (approximately)} $$

and therefore 

$$ x = .6745\sigma $$

that is: 

$$ E_x = .6745\sigma \tag{11} $$


If a distribution is not normal, its probable error is estimated by 
equation (11). 
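The interpolation above can also be carried out numerically. The following sketch solves the first of equations (10) by bisection; the choice of bisection and the function names are assumptions made for illustration, not the method of the text.

```python
import math

def area_from_zero(t):
    """Area under phi from 0 to t."""
    return 0.5 * math.erf(t / math.sqrt(2))

def probable_error_in_sigma_units(target=0.25, lo=0.0, hi=1.0, tol=1e-10):
    """Find t with area_from_zero(t) = target by repeated halving of [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if area_from_zero(mid) < target:
            lo = mid          # the root lies to the right of mid
        else:
            hi = mid          # the root lies at or to the left of mid
    return (lo + hi) / 2

t_e = probable_error_in_sigma_units()
print(t_e)                    # multiply by sigma to obtain E
```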

Figure 57 will assist in clarifying the concept of probable error. 
The values of the moments of the normal curve are given in 
Exercise 8, page 405. 




Figure 57 



EXERCISES 

1. Find the portions of the area under φ(t) indicated, and draw a figure 
in each case. 

a. A*]_l d. A*]“ g. 

b. Ad- 2 A e - A<d-2 A k - ^^-2 4 

A "I 2.389 f A 1 1°° • a “1-2.748 

C. -A0J_2 4 A< AJ2.389 ^J- 3.468 


2. Find t in the following equations: 

a. A^Jo = - 4838 «• = .4510 

b. Ad-\ = -4844 d. Ad- \ = -4878 

3. Verify the percentages of Figure 57, in which E is taken as the x-unit. 
The following exercises are for students of calculus. 


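4. Prove: 

$$ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} $$

Hint: Let 

$$ P = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy $$

which is the volume under the surface $z = e^{-(x^2+y^2)}$. Change to polar 
coordinates. Then 

$$ P = 4\int_{0}^{\pi/2}\int_{0}^{\infty} e^{-r^2}\,r\,dr\,d\theta = \pi $$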

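5. Show that y in (4) has a maximum at x = 0. 

6. Show that y in (4) has inflection points at x = ±σ. 

7. Consider equation (4). Show that the mean deviation about the 
mean is 

$$ M.D. = \frac{1}{Nw}\int_{-\infty}^{\infty} |x|\,y\,dx = \frac{2}{Nw}\int_{0}^{\infty} x\,y\,dx = \sigma\sqrt{\frac{2}{\pi}} = 0.79788\ldots\sigma $$

8. Evaluate the moments of the normal curve (4), where 

$$ \mu_s = \frac{1}{Nw}\int_{-\infty}^{\infty} x^s\,y\,dx $$

That is, show that 

$$ \mu_0 = 1,\quad \mu_1 = 0,\quad \mu_2 = \sigma^2,\quad \mu_3 = 0,\quad \mu_4 = 3\mu_2^2 = 3\sigma^4 $$

$$ \alpha_0 = 1,\quad \alpha_1 = 0,\quad \alpha_2 = 1,\quad \alpha_3 = 0,\quad \alpha_4 = 3 $$

$$ \alpha_{2n} = 1\cdot 3\cdot 5\cdots(2n-1) = \frac{(2n)!}{2^n\,n!} $$

$$ \alpha_{2n+1} = 0 $$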

9. Show that for the normal curve 

Mean Deviation about M = 1.183 Probable Error 
Probable Error = 0.8454 Mean Deviation 


104. ILLUSTRATIVE EXAMPLES 

Example 1. Given a normal distribution with N = 1,000, w = 2, 
M = 16, and σ = 4: a. How many variates fall between X = 12 and 
X = 20? b. How many lie above X = 26? c. How many lie below 
X = 10? 

Figure 58 






Figure 58 shows a normal curve with M = 16, σ = 4, area = (1000)2. 
Since our tables are expressed for values of t, we must transform our data 
into t units. We have shown three scales on the base line. If X = 12, 
x = X − M = 12 − 16 = −4 and t = x/σ = −4/4 = −1. Similarly, 
if X = 20, t = 1; if X = 26, t = 2.5, and if X = 10, t = −1.5. 

a. Now $A_\phi\big]_{-1}^{1}$ = .6826 

This means that 68.26 per cent of the total area under the curve lies between 
t = −1 and t = 1, or between X = 12 and X = 20. By means of the 
calculus it can be shown that the area under Y from X₁ to X₂ or under 
y from x₁ to x₂ is Nw × the area under φ(t) between t₁ and t₂, that is: 

$$ A_Y\big]_{X_1}^{X_2} = A_y\big]_{x_1}^{x_2} = Nw\cdot A_\phi\big]_{t_1}^{t_2} $$

See equations (4), (5), and (8). Therefore: 

$$ A_Y\big]_{12}^{20} = .6826(1000)2 = (682.6)2 $$

Since 

2,000 units of area represent 1,000 variates, 

(682.6)2 units of area represent 682.6 variates. 

That is, 682.6 variates fall between X = 12 and X = 20. 

In short, since $A_\phi\big]_{-1}^{1}$ = .6826 

we may say that 68.26 per cent of N or 

.6826(1,000) = 682.6 

variates fall between X = 12 and X = 20. 

b. Similarly, since $A_\phi\big]_{2.5}^{\infty}$ = .5000 − .4938 = .0062, 

.0062(1,000) = 6.2 

variates are beyond X = 26. 

c. Since $A_\phi\big]_{-\infty}^{-1.5}$ = .5000 − .4332 = .0668, 

.0668(1,000) = 66.8 

variates are below X = 10. 
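The three parts of Example 1 follow one pattern: convert the limits to t units, find the area under φ(t), and multiply by N. A minimal sketch of that pattern is given below; the function name `variates_between` is an illustrative choice, and the results may differ from the tabular answers in the last digit because no rounding to a four-place table takes place.

```python
import math

def cdf(t):
    """Area under phi from minus infinity up to t."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def variates_between(x_low, x_high, n_total, mean, sigma):
    """Expected number of variates between X = x_low and X = x_high."""
    t_low = (x_low - mean) / sigma
    t_high = (x_high - mean) / sigma
    return n_total * (cdf(t_high) - cdf(t_low))

N, M, SIGMA = 1000, 16, 4
part_a = variates_between(12, 20, N, M, SIGMA)
part_b = variates_between(26, math.inf, N, M, SIGMA)
part_c = variates_between(-math.inf, 10, N, M, SIGMA)
print(part_a, part_b, part_c)
```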


Example 2. For the distribution described in Example 1, compute Y 
when X = 4, 8, 12, 16, 20, 24, 28. 

Using (5), the equation of the curve is: 

$$ Y = \frac{(1000)2}{4\sqrt{2\pi}}\, e^{-\frac{(X-16)^2}{32}} $$

Let: 

$$ t = \frac{x}{\sigma} = \frac{X - 16}{4} $$

Then: 

$$ Y = \frac{(1000)2}{4\sqrt{2\pi}}\, e^{-\frac{t^2}{2}} = 500\,\phi(t) $$

Recalling that φ(−t) = φ(t), we have the following table of values. 

     X      t      φ(t)       Y 
     4     −3     .0044      2.2 
     8     −2     .0540     27.0 
    12     −1     .2420    121.0 
    16      0     .3989    199.4 
    20      1     .2420    121.0 
    24      2     .0540     27.0 
    28      3     .0044      2.2 
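The table of Example 2 can be regenerated from equation (8). The sketch below is illustrative only; `frequency_ordinate` is a name chosen here, and the computed values agree with the table to within the rounding of a four-place φ(t).

```python
import math

def phi(t):
    """Ordinate of the normal probability curve in standard units."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def frequency_ordinate(x_value, n_total, width, mean, sigma):
    """Y = (N w / sigma) * phi(t), with t = (X - M) / sigma, as in equation (8)."""
    t = (x_value - mean) / sigma
    return n_total * width / sigma * phi(t)

N, W, M, SIGMA = 1000, 2, 16, 4
for x_value in (4, 8, 12, 16, 20, 24, 28):
    print(x_value, (x_value - M) / SIGMA, frequency_ordinate(x_value, N, W, M, SIGMA))
```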


Example 3. If 10 coins are thrown, use the normal probability function 
to find the approximate probability of obtaining exactly 7 heads. 

The various probabilities are given by the terms of $(\tfrac12 + \tfrac12)^{10}$. 

The exact probability of obtaining 7 heads is given by: 

$$ P_7 = {}_{10}C_7\left(\tfrac12\right)^{3}\left(\tfrac12\right)^{7} = .117 $$

We may apply the normal curve to obtain an approximate value of P₇. We 
have: 

$$ M = np = 10\left(\tfrac12\right) = 5 $$

$$ \sigma = \sqrt{npq} = \sqrt{10\left(\tfrac12\right)\left(\tfrac12\right)} = 1.581 $$

$$ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} = \frac{1}{\sigma}\,\phi(t) \text{ gives the probability of any deviation } x. $$

We seek y for X = 7. But if X = 7, x = X − M = 7 − 5 = 2 and 
t = x/σ = 2/1.581 = 1.265. Since φ(1.265) = .1792, we have therefore 

$$ y = \frac{1}{\sigma}\,\phi(1.265) = \frac{.1792}{1.581} = .113 $$

The slight discrepancy in the two results is an evidence that the point 
of the given binomial is near the normal curve. 
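The comparison in Example 3 can be repeated for any single term of the binomial. The following sketch, with illustrative function names, computes the exact term and the normal-curve ordinate side by side.

```python
import math

def exact_term(n, k, p=0.5):
    """Exact probability of k successes in n trials (theorem of repeated trials)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_term(n, k, p=0.5):
    """Normal-curve ordinate (1 / sigma) * phi(t) at X = k."""
    mean = n * p
    sigma = math.sqrt(n * p * (1 - p))
    t = (k - mean) / sigma
    return math.exp(-t * t / 2) / (sigma * math.sqrt(2 * math.pi))

exact_p7 = exact_term(10, 7)
approx_p7 = normal_term(10, 7)
print(exact_p7, approx_p7)
```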





Example 4. Given a normal distribution with M = 75 and σ = 8, 
what limits will include the middle 75 per cent of the total frequency? 

We must solve the equation: 

$$ A_y\big]_{-x}^{x} = .75\,Nw $$

or the equation 

$$ A_\phi\big]_{-t}^{t} = .75 $$

Since 

$$ A_\phi\big]_{-t}^{t} = 2A_\phi\big]_{0}^{t} = .75, $$

that is, $A_\phi\big]_{0}^{t}$ = .375, we have from the tables: 

$$ t = \frac{x}{\sigma} = 1.15 $$

and therefore: 

$$ x = 1.15\sigma = (1.15)8 = 9.20 $$

Hence the limits are M ± x = 75 ± 9.20 = 65.80 and 84.20. 
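Example 4 reads the table backwards, from an area to a value of t. As a sketch of the same step without tables, the code below uses `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later); the function name `middle_limits` is an illustrative choice.

```python
from statistics import NormalDist

def middle_limits(fraction, mean, sigma):
    """Limits M - x and M + x that enclose the middle `fraction` of a normal distribution."""
    # The area from minus infinity to +t is 0.5 + fraction / 2.
    t = NormalDist().inv_cdf(0.5 + fraction / 2)
    x = t * sigma
    return t, mean - x, mean + x

t_value, lower, upper = middle_limits(0.75, 75, 8)
print(t_value, lower, upper)
```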

In approximating a sum of the successive terms of the point bi- 
nomial by the normal curve, we must find the area under the ap- 
propriate part of the curve. The sum of the successive terms of the 
binomial equals the sum of the areas of the corresponding rectangles 
of the histogram. We must then replace the rectangles of the histo- 
gram by corresponding areas of the curve and this requires that we 
use whole rectangles, not half rectangles at the ends. 

It is evident that the normal curve will give a close approximation 
to the sum of the terms of a binomial only when p and q are nearly 
equal, and n is fairly large. Certainly if there is considerable skew- 
ness, the approximation by the normal curve may not be satisfactory, 
especially near the ends of the distribution. We cannot make definite 
statements as to when the normal curve may be used as an ap- 
proximation to the binomial. Whether the approximation is satis- 
factory or not depends upon the accuracy of the results desired and 
how the approximation is to be used. 

Exercise 6. If 10 coins are tossed, what is the probability of getting 
4, 5, 6 or 7 heads? (a) Use the theorem of repeated trials for an accurate 
result correct to two decimals, and (b) use the normal curve to find an 
approximate result. 

Solution to (a). By the theorem of repeated trials the required 
probability is the sum 

$$ \sum_{X=4}^{7} {}_{10}C_X\left(\tfrac12\right)^{X}\left(\tfrac12\right)^{10-X} $$

This sum is 792/1024 = .77. 

Solution to (b). The rectangles of the histogram for X = 4, 5, 6, 7 
extend from X = 3.5 to X = 7.5, that is, from t = −1.5/1.581 = −.95 
to t = 2.5/1.581 = 1.58. Hence: 

Approximate P = $A_\phi\big]_{-.95}^{1.58}$ = .3289 + .4430 = .7719. 
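The use of whole rectangles amounts to extending the limits by half a class width at each end. The sketch below illustrates this for the coin problem above; the function names are illustrative, and the half-unit extension is written out explicitly in the code.

```python
import math

def cdf(t):
    """Area under phi from minus infinity up to t."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def exact_sum(n, first, last, p=0.5):
    """Sum of the binomial terms for X = first, ..., last."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(first, last + 1))

def normal_sum(n, first, last, p=0.5):
    """Normal-curve area over the whole rectangles from first - 1/2 to last + 1/2."""
    mean = n * p
    sigma = math.sqrt(n * p * (1 - p))
    t_low = (first - 0.5 - mean) / sigma
    t_high = (last + 0.5 - mean) / sigma
    return cdf(t_high) - cdf(t_low)

exact_p = exact_sum(10, 4, 7)
approx_p = normal_sum(10, 4, 7)
print(exact_p, approx_p)
```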


105. ON THE SIGNIFICANCE OF RESULTS 

It has been observed that for a normal or a moderately skewed, 
mound-shaped distribution the total range seldom exceeds six times 
the standard deviation. If, then, a distribution is approximately 
normal, it is not expected that a measure chosen at random will show 
a variation of more than three times the standard deviation, on either 
side, from the mean. A divergence of more than ± 3σ (about 
± 4.5E) may be called significant; that is, other forces than mere 
chance have most probably operated to bring about abnormal results. 
Thus if 400 coins are tossed (or if one coin is tossed 400 times) 
what is the allowable variation in the number of heads? We have: 

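$$ M = np = 400\left(\tfrac12\right) = 200 = \text{the expected number of heads} $$

and 

$$ \sigma = \sqrt{npq} = \sqrt{400\left(\tfrac12\right)\left(\tfrac12\right)} = 10, \qquad 3\sigma = 30 $$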

It is very improbable then that fewer than 170 (= 200 − 30) or 
more than 230 (= 200 + 30) heads will appear. In fact we can measure 
the probability in question. Since $A_\phi\big]_{-3}^{3}$ = .9974, if 400 coins 
are tossed, the probability of obtaining between 170 and 230 heads 
is 9,974/10,000. That is, the probability of obtaining more than 
230 or fewer than 170 heads is 26/10,000. In other words, the odds 
in favor of obtaining 200 ± 30 heads are 9,974 to 26 or 
383.6 to 1. 

In general, we may state that the probability of a measure’s lying 
within the range M ± 3σ or M ± 4.5E is 9,974/10,000 and that 
the odds favoring a measure’s lying within this range are nearly 
385 to 1. 
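The coin-tossing illustration can be recomputed as follows. This is a sketch only, with variable names chosen here. Note that the odds are sensitive to the fourth decimal of the area: the four-place table value .9974 gives about 384 to 1, while the unrounded area (about .9973) gives odds nearer 370 to 1.

```python
import math

n, p = 400, 0.5
mean = n * p                               # expected number of heads
sigma = math.sqrt(n * p * (1 - p))         # sqrt(npq)
lower, upper = mean - 3 * sigma, mean + 3 * sigma

inside = math.erf(3 / math.sqrt(2))        # area under phi from t = -3 to t = +3
odds = inside / (1 - inside)               # odds in favor of lying within M ± 3 sigma

print(mean, sigma, lower, upper, inside, odds)
```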

Another type of language has become fashionable when speaking 
of certain t or x values in connection with the normal curve. It is 
seen from our tables that $A_\phi\big]_{-1.96}^{1.96}$ = .95 and thus 5 per cent of the 
area lies outside the limits t = ± 1.96 or x = ± 1.96σ. Consequently, 
there is 1 chance in 20 that x may lie outside ± 1.96σ. 
This value 1.96σ is called the 5 per cent level of significance. Similarly, 
$A_\phi\big]_{-2.576}^{2.576}$ = .99 and thus 1 per cent of the area lies outside the 
limits t = ± 2.576 or x = ± 2.576σ. Consequently, there is 1 chance 
in 100 that x may lie outside ± 2.576σ. This value 2.576σ is called 
the 1 per cent level of significance. These values may be called 
confidence limits, the probability giving a measure of confidence that 
an item falls within the stated limits. 
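The two levels just named can be recovered from the inverse of the area function. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the function name `significance_limit` is an illustrative choice.

```python
from statistics import NormalDist

def significance_limit(level):
    """Value of t such that the area outside -t to +t equals `level`."""
    # Half of `level` lies in each tail, so the area up to +t is 1 - level / 2.
    return NormalDist().inv_cdf(1 - level / 2)

t_5_percent = significance_limit(0.05)
t_1_percent = significance_limit(0.01)
print(t_5_percent, t_1_percent)
```

Multiplying either value by σ gives the corresponding limit in x units.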

The question, “At what probability level does a deviation become 
significant?” is one that cannot be answered with scrupulous exact- 
ness. Statisticians differ in their credulity. Any level that is set is 
arbitrary. Conceivably, a deviation x may be any amount. How- 
ever, the occurrence of the deviation may be so unlikely that it can 
hardly be looked upon as due to chance. Some authorities state 
that if x is outside the 5 per cent level it is significant ; if it is outside 
the 1 per cent level, it is highly significant. A safe procedure for the 
student is that he be prepared to state in terms of probability, or 
as a percentage, the level of significance for any deviation. 

Questions. What are the values of t and x for the 10 per cent level 
of significance? 

What are the values of t and x for the 25 per cent level of significance? 

What is the per cent level of significance of a deviation t = ± 3 or 
x = ± 3σ? 





EXERCISES 

1. In a coin-tossing experiment in which a coin was tossed 400 times, 
250 heads appeared. Do you believe that the experiment was honestly 
performed? 

2. Suppose that the mortality statistics for a large group of cities show 
the average death rate from tuberculosis to be 196.5 per 100,000 population, 
and σ = 14. A particular city showed a death rate from tuberculosis of 
110.3 per 100,000. Is this surprising? Another city (a haven for tuber- 
culosis patients) showed a death rate of 245 per 100,000 for the same 
disease. Is this surprising from the point of view of mere chance? 

3. A coin was tossed 100 times. Find, using the normal curve, the 
probability of obtaining exactly 60 heads. 

4. In a college the 12 grades A + , A, A— ; B + , B, B— ; C+, C, C — ; 
D, E, and F are given. On the assumption that ability in mathematics 
is normally distributed, how many in a group of 1,000 grades should re- 
ceive each grade mentioned? Assume that the total range is M ± 3.6σ. 

5. (Thurstone) Construct three frequency curves on the same sheet 
according to the following specifications. Indicate an ordinate at the 
midpoint of each class interval. 

    Curve     σ      M       N       w 
      A      15     50      400     10 
      B      15     50      800     10 
      C      15     50    1,200     10 

6. Construct three frequency curves on the same sheet according to 
the following specifications. Compute ordinates for each half-sigma. 

    Curve     σ      M       N       w 
      A       5     50    1,000     10 
      B      10     50    1,000     10 
      C      15     50    1,000     10 


7. Draw a normal curve φ(t) and divide the base line into five parts 
such that when ordinates are erected at the points of division the five areas 
will be equal. 

8. A normal distribution has the following constants: N = 1,000; 
w = 5; M = 73.64; σ = 8.3. How many variates are between X = 61 
and X = 94? 

9. Determine whether it is expected that one will obtain: 

a. 2,048 heads in 4,040 throws of a coin. 

b. 3,300 heads in 6,400 throws of a coin. 

c. 38,024 appearances of a four, a five, or a six in 78,000 throws of a 
single die. 





10. Compute the ordinates for the point binomial $(\tfrac12 + \tfrac12)^{16}$ and 
compare them with the ordinates of a superimposed normal curve. 

11 . If a baseball player has a batting average of 0.300, what is the 
probability that he will hit safely at least 25 times out of 100 times at bat? 
Estimate by the normal curve. Note that α₃ is small. 

12 . If 16 coins are tossed, what is the probability of getting 5, 6, 7, 
8, 9, 10, 11, or 12 heads? (a) Use the theorem of repeated trials for a result 
correct to two decimals, and (b) the normal curve for an approximate 
result to two decimals. 

13 . The probability of a man of age 56 dying within a year is 0.02. 
If an insurance company has 10,000 policies in force on men of this age, 
find the probability of the company’s having to pay less than 180 death 
claims; more than 220 death claims. Estimate by the normal curve. 
Note that α₃ is small. 

14 . A large number of students were measured as to height and for 
them we found M = 67.5 inches. We found that 40 per cent of the 
students were between 66.2 inches and 68.8 inches in height. What is the 
standard deviation of the heights? 

15 . In the United States in 1930, 12 per cent of the marriageable men 
were widowers. Assume this situation normal. A city has 6,000 men who 
are marriageable (single men 15 years old and over). (a) How many would 
you expect to be widowers? Note that α₃ is small. (b) Estimate the 
probability that there will be as few as 600 widowers. (c) As many as 
750 widowers. 

16 . The experience of a manufacturing concern has been that in the 
past they have had to discard 5 per cent of the units inspected as de- 
fective. A sample of 1,000 units is up for inspection, (a) How many 
defective units would you expect? (b) What are the values at the 5 per 
cent level of significance? 

17 . In 1930, about 9 per cent of the people of the United States were 
“20 and under 25” years of age. In a typical city of the United States 
of population 10,000, how many would you expect to find between 20 
and 25 years of age? Adopting ± 3σ as the limits of reasonable chance 
occurrence, would you be surprised to find as few as 800? As many 
as 1000? 

18 . (Waugh) In an epidemic of infantile paralysis which took place 
in the eastern part of the United States in the fall of 1931, we have records 
on 927 children who contracted the disease. Of these, 408 received no 
serum and 104 of the 408 became paralyzed, while the other 304 recovered 
without paralysis. If the serum had no effect, how many cases would you 
have expected among the 519 who were given serum? (Assume 3σ marks 
the limit of reasonable chance occurrence.) Actually 166 of the children 
receiving serum were paralyzed. What do you conclude as to the efficacy 
of the serum? What other factors might influence the result besides the 
effect of the serum? 





19 . A group of 1,000 students took an objective and standardized test. 
The distribution was closely normal with M = 60 and σ = 10. What 
are the values of Q₁, Q₃, Q, M.D., α₃, α₄, and the 87th percentile? 

20 . It has been established that of children under one year of age who 
are afflicted with whooping cough about 50.5 per cent recover. A hospital 
has 27 children less than a year old who are afflicted with this disease. 
Establish the 5 per cent level of significance as to the number of re- 
coveries and state carefully what you have found. 

21 . In the registration area of the United States in 1931, 51 per cent 
of the births were males. In a certain city in 1931, 100 babies were born, 
(a) What is the probability of as few as 45 females? (b) As many as 
60 females? (c) What is the probability of exactly 45 females? (d) What 
is the probability of exactly 60 females? 

106. GRADUATION OF A DISTRIBUTION BY THE 
NORMAL CURVE 

In this book we have frequently called attention to the fact that 
the distributions of observed data that we have analyzed are samples 
of a larger population or universe. It has been pointed out that the 
irregularities of the distributions may be due to a paucity of the 
data or to fluctuations in sampling. The frequency curve is assumed 
to represent generalized experience of data of a given type on the 
assumptions (1) that N has been greatly increased and (2) that the 
class intervals have been indefinitely diminished. By fitting a curve 
to the observed data we have opportunity to compare observation 
with idealization and to note the variations due to sampling. 

If a mound-shaped frequency distribution is reasonably symmetri- 
cal, the normal curve may approximately represent it. Of course if 
a distribution is decidedly skew, a normal curve is not expected to 
fit the data. Our problem in this section is to explain the steps in 
determining the theoretical frequencies of a distribution, assuming 
that they follow a normal curve. As was implied in the derivation 
of the normal curve we assume that: 

1. The mean and the standard deviation of the curve are equal to M 
and σ_adj. of the observed data. 

2. The area under the curve equals the area of the histogram. 

It follows from the first assumption that the first step in fitting a 
normal curve to a distribution of observed data is to compute M 
and σ_adj. 



A. Graduation by Ordinates. The following table of the graduation 
of the distribution of the heights of colored soldiers (see p. 168) 
will show the steps in the process. For the distribution in question 
we have previously computed M = 171.89, σ_adj. = (3.3996)2. Applying 
equation (8), the theoretical frequencies are given by: 

$$ y = \frac{(6441)2}{(3.3996)2}\,\phi(t) = 1894.6346\,\phi(t) $$

The values of t which correspond to the given values of X are most 
easily found by multiplying x by 1/σ_adj., and in this case: 

$$ \frac{1}{\sigma_{adj.}} = 0.147076 $$

Table 93. Graduation by the Normal Curve: Ordinates 

      X      Observed    x = X − M    t = x/σ     φ(t)      Theoretical 
               f(x)                                        f(x) = 1894.6346 φ(t) 
     (1)       (2)          (3)         (4)        (5)         (6) 

    148.5        2        −23.39      −3.44      .0011         2.1 
    150.5        9        −21.39      −3.15      .0028         5.3 
    152.5       13        −19.39      −2.85      .0069        13.1 
    154.5       23        −17.39      −2.56      .0151        28.6 
    156.5       56        −15.39      −2.26      .0310        58.7 
    158.5       88        −13.39      −1.97      .0573       108.6 
    160.5      162        −11.39      −1.68      .0973       184.4 
    162.5      318         −9.39      −1.38      .1540       291.8 
    164.5      468         −7.39      −1.09      .2203       417.4 
    166.5      564         −5.39      −0.79      .2920       553.2 
    168.5      665         −3.39      −0.50      .3521       667.1 
    170.5      708         −1.39      −0.20      .3910       740.8 
    172.5      749         +0.61      +0.09      .3973       752.7 
    174.5      747          2.61       0.38      .3712       703.3 
    176.5      586          4.61       0.68      .3166       599.8 
    178.5      469          6.61       0.97      .2492       472.1 
    180.5      314          8.61       1.27      .1781       337.4 
    182.5      207         10.61       1.56      .1182       224.0 
    184.5      133         12.61       1.85      .0721       136.6 
    186.5       70         14.61       2.15      .0396        75.0 
    188.5       38         16.61       2.44      .0203        38.5 
    190.5       22         18.61       2.74      .0094        17.8 
    192.5       15         20.61       3.03      .0041         7.8 
    194.5       10         22.61       3.33      .0016         3.0 
    196.5        3         24.61       3.62      .0006         1.1 
    198.5        2         26.61       3.91      .0002         0.4 

    Total    6,441                                           6,440.6 

The following steps are recommended as the proper procedure in 
fitting a normal curve by ordinates. 


1. Compute M, σ_adj., and 1/σ_adj. 

2. Using equation (8), write the equation of the theoretical frequen- 
cies. 

3. Write down columns (1) and (2), giving class marks and frequencies, 
of the table upon which the computations are to be carried out. 

4. Compute values of x for column (3). 

5. Compute values of t for column (4). 

6. Write down values of <f>(t) from the table in Appendix B. 

7. Compute the theoretical frequencies from the equation found in 
step 2. 
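The seven steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the computing form of the text: φ(t) is evaluated directly rather than read from Appendix B, t is not rounded to two decimals, and the function name `graduate_by_ordinates` is chosen here. The constants are those of the distribution of heights used in Table 93, so individual frequencies may differ from the table by a few tenths.

```python
import math

def phi(t):
    """Ordinate of the normal probability curve in standard units."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def graduate_by_ordinates(class_marks, n_total, width, mean, sigma):
    """Return rows (X, x, t, phi(t), theoretical f) for each class mark."""
    scale = n_total * width / sigma            # N w / sigma, the multiplier in equation (8)
    rows = []
    for mark in class_marks:
        x = mark - mean                        # column (3)
        t = x / sigma                          # column (4)
        rows.append((mark, x, t, phi(t), scale * phi(t)))
    return rows

N, W, M, SIGMA = 6441, 2, 171.89, 3.3996 * 2
marks = [148.5 + 2 * i for i in range(26)]     # class marks 148.5, 150.5, ..., 198.5
table = graduate_by_ordinates(marks, N, W, M, SIGMA)
total = sum(row[4] for row in table)

for mark, x, t, ordinate, frequency in table:
    print(mark, round(x, 2), round(t, 2), round(ordinate, 4), round(frequency, 1))
print("Total", round(total, 1))
```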

B. Graduation by Areas. The graduation of a distribution by 
areas depends upon a few notions that we have not yet sufficiently 
clarified. Since [see page 406] 

$$ A_Y\big]_{X_1}^{X_2} = A_y\big]_{x_1}^{x_2} = Nw\cdot A_\phi\big]_{t_1}^{t_2} $$

and further, since 

Nw units of area represent N variates, 

then 

Nw · $A_\phi\big]_{t_1}^{t_2}$ units of area represent N · $A_\phi\big]_{t_1}^{t_2}$ variates. 

We shall indicate the increment of area under φ(t) between t₁ 
and t₂ by ΔA. The theoretical frequencies will then be computed by 
N · ΔA. 

By this means we are able to find the theoretical frequencies of 
the various classes (to which the incremental areas under the curve 
correspond) and compare them to the observed frequencies (to 
which the rectangular areas of the histogram correspond). That is, 
we compare, for example, the areas X₁ABX₂ and X₁CDX₂, or the 
frequencies which they represent. 
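Graduation by areas replaces each ordinate by the increment of area ΔA over the class. A minimal sketch follows, again assuming direct evaluation of the area function in place of Appendix B; the function name `graduate_by_areas` is illustrative. Since the tabular method rounds t to two decimals before the look-up, individual classes computed this way can differ from hand-computed values by a few per cent, while the total remains close to N.

```python
import math

def cdf(t):
    """Area under phi from minus infinity up to t."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def graduate_by_areas(lower_limits, width, n_total, mean, sigma):
    """Return rows (lower limit, t at lower limit, delta A, theoretical f = N * delta A)."""
    rows = []
    for lower in lower_limits:
        t1 = (lower - mean) / sigma
        t2 = (lower + width - mean) / sigma
        delta_a = cdf(t2) - cdf(t1)            # increment of area over the class
        rows.append((lower, t1, delta_a, n_total * delta_a))
    return rows

N, W, M, SIGMA = 6441, 2, 171.89, 3.3996 * 2
limits = [147.5 + 2 * i for i in range(26)]    # lower limits 147.5, 149.5, ..., 197.5
areas_table = graduate_by_areas(limits, W, N, M, SIGMA)
areas_total = sum(row[3] for row in areas_table)

for lower, t1, delta_a, frequency in areas_table:
    print(lower, round(t1, 2), round(delta_a, 4), round(frequency, 1))
print("Total", round(areas_total, 1))
```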




Table 94. Graduation by the Normal Curve: Areas 

    Class     Observed    l_x − M     t = (l_x − M)/σ_adj.   A_φ from    ΔA      Theoretical 
    lower       f(x)                                         −∞ to t             f(x) = N · ΔA 
    limit: l_x 
     (1)        (2)         (3)            (4)                 (5)       (6)        (7) 

    147.5         2       −24.39         −3.59               .0002     .0003        1.9 
    149.5         9       −22.39         −3.29               .0005     .0008        5.2 
    151.5        13       −20.39         −3.00               .0013     .0022       14.2 
    153.5        23       −18.39         −2.70               .0035     .0045       29.0 
    155.5        56       −16.39         −2.41               .0080     .0090       58.0 
    157.5        88       −14.39         −2.12               .0170     .0174      112.1 
    159.5       162       −12.39         −1.82               .0344     .0286      184.2 
    161.5       318       −10.39         −1.53               .0630     .0463      298.2 
    163.5       468        −8.39         −1.23               .1093     .0643      414.2 
    165.5       564        −6.39         −0.94               .1736     .0842      542.3 
    167.5       665        −4.39         −0.65               .2578     .1054      678.9 
    169.5       708        −2.39         −0.35               .3632     .1129      727.2 
    171.5       749        −0.39         −0.06               .4761     .1187      764.5 
    173.5       747        +1.61         +0.24               .5948     .1071      689.8 
    175.5       586         3.61          0.53               .7019     .0948      610.6 
    177.5       469         5.61          0.83               .7967     .0719      463.1 
    179.5       314         7.61          1.12               .8686     .0521      335.6 
    181.5       207         9.61          1.41               .9207     .0357      229.9 
    183.5       133        11.61          1.71               .9564     .0209      134.6 
    185.5        70        13.61          2.00               .9773     .0120       77.3 
    187.5        38        15.61          2.30               .9893     .0059       38.0 
    189.5        22        17.61          2.59               .9952     .0028       18.0 
    191.5        15        19.61          2.88               .9980     .0013        8.4 
    193.5        10        21.61          3.18               .9993     .0004        2.6 
    195.5         3        23.61          3.47               .9997     .0002        1.3 
    197.5         2        25.61          3.77               .9999     .00008       0.5 
    199.5         0        27.61          4.06               .99998    .00000       0.0 

    Total     6,441                                                              6,439.6 


We shall illustrate the procedure by graduating the distribution of 
the heights of colored soldiers (see Table 94) for which we have found: 

$M = 171.89$, $\sigma_{\text{adj.}} = (3.3996)(2)$, and $\dfrac{1}{\sigma_{\text{adj.}}} = .147076$.¹

1 The question that naturally presents itself to the thoughtful student at this 
point is: What is the criterion to determine the goodness of fit of a theoretical 
curve to an observed distribution? We regret that the answer to this important 
question takes us beyond the scope of this text. We can refer the reader to page 
78 of Rietz and others, Handbook of Mathematical Statistics, and to Karl Pearson’s 
Tables for Statisticians, Pt. I. These references will give a brief discussion of 
Pearson’s Chi-square test. For fuller information we refer the reader to Pearson’s 
original paper in Philosophical Magazine , Vol. 50, ser. 5 (1900), pp. 157-75. 





In the graduation of a distribution by the normal curve, using 
areas, we shall find it convenient to follow the following steps. 

1. Compute $M$, $\sigma_{\text{adj.}}$, and $1/\sigma_{\text{adj.}}$.

2. Write down columns (1) and (2) of the table giving lower class-limits
and frequencies. Note that the classes are defined by their lower
limits, $l_x$.

3. Express the lower limits as deviations from $M$: $l_x - M$. This
gives column (3) of the table.

4. Express the deviations from $M$ in units of $t$:
$t = \dfrac{l_x - M}{\sigma_{\text{adj.}}}$. This gives column (4) of the table.

5. Using the table of $\phi(t)$ in Appendix B, prepare column (5) of the table.
It will be noted that the desired areas are found by subtracting the
values in the table from 0.5000 for $t < 0$, and by adding the values
in the table to 0.5000 for $t > 0$.

6. By subtracting each area in column (5) from the area immediately
beneath it we compute $\Delta A$. This gives column (6).

7. Compute the theoretical frequencies, $N \cdot \Delta A$.
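
The seven steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the cumulative area is computed with `math.erf` instead of being read from the table in Appendix B, so the results differ slightly from Table 94, where t is rounded to two decimals.

```python
import math

def normal_cdf(t):
    """Area under the standard normal curve from -infinity to t (column 5)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def graduate_by_areas(lower_limits, mean, sigma, n_total):
    """Theoretical class frequencies N * (delta A) from lower class limits."""
    # Steps 3-4: deviations from the mean, expressed in units of sigma.
    t_values = [(lx - mean) / sigma for lx in lower_limits]
    # Step 5: cumulative area up to each lower limit.
    areas = [normal_cdf(t) for t in t_values]
    # Step 6: each area subtracted from the one immediately beneath it.
    delta_areas = [below - above for above, below in zip(areas, areas[1:])]
    # Step 7: theoretical frequency of each class.
    return [n_total * da for da in delta_areas]

# Lower limits 147.5, 149.5, ..., 199.5 and the constants quoted for Table 94.
lower_limits = [147.5 + 2 * i for i in range(27)]
theoretical = graduate_by_areas(lower_limits, 171.89, 3.3996 * 2, 6441)

for lx, freq in zip(lower_limits, theoretical):
    print(f"{lx:6.1f}  {freq:7.1f}")
```

Because 27 lower limits bound only 26 classes, the sketch returns one frequency fewer than the number of rows in the table; the last row of Table 94 is the open tail, whose theoretical frequency is 0.0.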

EXERCISES 

1. Graduate by ordinates and by areas the distribution of chest measure- 
ments which is given in Exercise 10, page 168. 

2. Graduate the distribution of the heights of college men given in 
(a) of Exercise 1, page 54. Use areas. 

3. Graduate by areas the distribution of the head breadths given in
Exercise 2, page 54.

4. Find the equation of the distribution of pulse beats which is found
in Table 29 (p. 165), assuming normality.

5. Plot the normal curve and the frequency polygon for the theoretical
and the observed distributions given in Table 93. Do the same for the
distributions in Table 94.


MISCELLANEOUS EXERCISES 


1. If

$$Y_X = \frac{1}{\sigma_X\sqrt{2\pi}}\, e^{-\frac{(X - M_X)^2}{2\sigma_X^2}},$$

show that $Y_{AX}$

2. If $Y_X$ has the value given in Exercise 1, find $Y_{AX+b}$ in terms of $Y_X$.

3. Three per cent of all children are left-handed. In a group of 
1,000 children what is the probability that as few as 20 will be left-handed? 
That as many as 40 will be left-handed? Establish the number of children 
at the 5 per cent level of significance. 





4 . Based upon the Mendelian hypothesis, it is expected that, on 
crossing a certain type of pea, 25 per cent of the seeds will be green. An 
experiment on this type of pea gave 4,960 yellow and 1,840 green seeds. 
Is the divergence within the 5 per cent level of significance? 

5. Show that the frequency curve



is symmetrical. 

6. Plot on the same axes frequency curves of the form given in Exer- 
cise 5 when (1) a = 5, p = 2; (2) a = 5, p = 10; (3) a = 5, p = 100. 
Assume y 0 = 100 in each case. 

7. Show that as $a$ and $p$ increase without limit but in such a way that
$a/p$ is constant and equal to $\lambda^2$, the curve given in Exercise 5 approaches
the normal form

$$y = y_0\, e^{-x^2/\lambda^2}$$

8. We replace the single constant $a$ in Exercise 5 after factoring by
$a_1$ and $a_2$ thus obtaining

which is skew. Plot on the same axes this curve when (1) $a_1 = 4$, $a_2 = 5$,
$p = 10$; (2) $a_1 = 10$, $a_2 = 5$, $p = 0.3$. Assume $y_0 = 100$.

9. Show that as $a_2$ increases without limit, $a_1$ remaining constant,
the formula in Exercise 8 approaches the form

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{pa_1} e^{-px}$$

10. Draw the curve in Exercise 9 when $y_0 = 25$, $a_1 = 12$, $p = 1.3$.



Chapter 13 

THE THEORY OF SAMPLING: MEASURES 
OF RELIABILITY 

107. INTRODUCTION 

We may regard the numerical description of any mass of statistical 
data from two points of view. We may regard the description as an 
end in itself, a mere summary of our measurements, or we may 
regard it as a sample drawn from a larger group which we call the 
parent population or the universe . 

Usually, the larger point of view obtains, that of forming judgments 
of the universe from a study of the sample. In some cases it is 
impossible to measure the entire universe, and in other cases it is 
impracticable to do so. Even if measuring the entire
universe were possible, the added expense in time and
labor would be an unnecessary luxury. For, by carefully selecting a
sample, excellent estimates of the statistical parameters of the uni- 
verse can be obtained. 

The statistician is, therefore, generally forced to work with samples. 
We compute the mean of the sample and use this mean as a basis for 
estimating the mean of the universe. Similarly, we use the dispersion 
of the sample as a basis for estimating the dispersion of the universe; 
and so on. Naturally, we must then attempt to state the degree 
of confidence we can attach to our estimates. This we do in terms 
of probability. 

It is obvious that in order to make a good estimate of the universe, 
we must have a good sample, a representative sample. Securing 
such a sample is not always an easy task, but generally it can be 
done. The procedures employed in securing such samples are be- 
yond the scope of this book. In what follows, when we use the term 
sample, we mean a statistical sample wherein any one individual in 
the parent population is just as likely to be included as any other. 
Such a sample is often called a random sample . 



This process of generalizing statistical results, of making inferences 
regarding the universe from the study of the sample, is called sta- 
tistical induction. Obviously, it is a problem of supreme importance. 
Karl Pearson has called it “the fundamental problem of practical 
statistics.” 


108. THE PROBLEM OF THIS CHAPTER 

We have spent no little time in the preceding chapters with ques- 
tions relating to the numerical description of a mass of data as an 
end in itself. We have seen that it is possible to describe succinctly 
a mass of numerical data. The essence of the data may be condensed 
to four measures: (1) the mean, (2) the dispersion, (3) the skewness, 
and (4) the excess. For example, given the measurements of the 
heights of 1,000 men, we are able to give a numerical description of 
the 1,000 measurements. They may show an arithmetic mean of 
67.5 inches, a standard deviation of 2.5 inches, a coefficient of skewness,
$\alpha_3$, of 0.036, and an excess, $\alpha_4 - 3$, of 0.123. If our problem
is limited to a characterization of the 1,000 measurements, our 
problem is fairly completely solved. In characterizing a mass of data 
by means of a few statistical constants, we are able to comprehend 
the significant facts of the mass which might not otherwise be possible. 

If we adopt the second and broader point of view and consider 
the 1,000 measurements as a representative sample and are concerned 
with using the properties of the sample in order to make inferences 
about the parent population from which the sample is chosen, it is 
clear that we cannot speak with meticulous certainty concerning 
the computed statistical constants and, as a consequence, our lan- 
guage should be modified. Another sample of 1,000 measurements 
of the heights of men chosen in a similar manner will probably yield 
at least slightly different statistical constants. In other words, these 
so-called statistical constants show variation as we move from sample to 
sample . 

While the statistical constants computed from successive samples 
show variation, it must not be inferred that the variation is unlimited. 
As a matter of fact the statistical constants computed from moder- 
ately large random samples selected from a larger group show an 
uncanny stability. It is due to this remarkable and measurable
stability of the statistical constants computed from sample to
sample that we may make inferences from a relatively small set of 
observed data. A measure of the stability of a statistical constant is 
often called a measure of its reliability. 

The so-called statistical constant derived from the analysis of a 
sample is frequently called by some writers, following R. A. Fisher, 
a statistic , and the corresponding quantity belonging to the universe 
a parameter. A statistic is thus an estimate of a parameter. For a 
given universe a parameter is fixed but the statistic may vary from 
sample to sample. 

It will be the problem of this chapter to define a range of variation 
about the statistical parameter of the universe within which fluctua- 
tions of the statistics , due to pure chance, may be expected to occur 
according to definite probabilities. It must be borne in mind that 
the variations due to a multiplicity of factors other than pure chance 
can in no way be accounted for by the sampling formulas that we 
shall discuss. The variations we are considering “are the resultant 
effect of a complex of forces which cannot be traced, still less measured, 
and which have been happily described as that ‘ mass of floating causes 
generally known as chance.’” If the variations are greater than 
can be accounted for by chance, the significance of the variation 
should, if possible, be accounted for and explained by the observer. 

We may meet problems that fall into two broad categories. In 
the first category the parent universe may be known and we may wish 
to establish whether or not a statistic of a sample falls within a pre- 
determined range of variation. (In this case the parent universe 
is generally finite.) For example, a manufacturer of some article 
may have examined a large number of a given type of product, 
and thus may have been able to adopt rather rigid specifications for 
the product. A sample is selected for a test. Does the sample fall
within the tolerance limits demanded by the universe? 

In the second category the parent universe is unknown and we 
wish to estimate its parameters by finding the statistics of the 
sample, and to measure the reliability (or degree of confidence) we 
may place in the estimates. By far, most problems that occur in 
the applications of the theory of sampling belong in this category. 
In this case the universe is generally considered as infinite. 

In most cases that arise, whether the universe be known or unknown,
stated in rather general terms the question is: How well
does the sample describe the universe? More precisely: How much 
shall we allow the values of the statistical constants obtained from 
the sample to vary to describe the parent universe? 

109. THE STANDARD DEVIATION IN CLASS FREQUENCIES

Table 95A                         Table 95B
(Parent Population)               (Theoretical Frequencies)

Class      F(x)                   Class      f(x)

  1        3,000                    1          30
  2        6,000                    2          60
  3       13,000                    3         130
  4       18,000                    4         180
  5       20,000                    5         200
  6       19,000                    6         190
  7       12,000                    7         120
  8        7,000                    8          70
  9        2,000                    9          20

Total    100,000                  Total     1,000


Suppose the frequency distribution of some single characteristic 
is given by Table 95A. The relative frequencies of the several classes 
are 3/100, 6/100, 13/100, etc. We choose from this homogeneous 
population a sample of 1,000. The “expected” distribution of the 
sample, by Section 95, would be that given by Table 95B. We know 
of course from experience that the theoretically “expected” fre- 
quencies would differ from those that would result from experiment 
just as I know that if I toss a coin 100 times I “expect” 50 heads 
and 50 tails whereas I may actually get 48 heads and 52 tails. And 
from my experience with coin-tossing experiments I am not shocked 
by this result. 

Suppose that we should obtain a large number of samples of 1,000 
observations, each taken under the same essential conditions. A 
class frequency, say that of Class 3, will vary from sample to sample. 
These values will form a frequency distribution. The variations, 
called “variations due to sampling” or “variations due to sampling 
errors,” can frequently be accounted for and explained. Such a
question as, “What is the variation that would occur in Class 3 if
we obtained a large number of samples of 1,000 observations from
the population in Table 95A?” we can answer approximately.

To answer this question we consider any observation as a trial,
and call it a success if the observation falls in the class. Thus the
probability of an observation falling in Class 3 is $p = 13/100$, and the
probability of an observation not falling in the class is $q = 87/100$.
And we have the standard deviation of the frequency of this class
to be theoretically $\sqrt{1{,}000(.13)(.87)} = 10.6$. So that we should
expect $Np \pm 3\sqrt{Npq}$ or $130 \pm 32$ observations as setting the limits
of the frequency of Class 3 of the sample of 1,000.

If the probable error rather than the standard deviation is taken
as the measure of the variation, then the probable error of the
frequency of Class 3 is $0.6745\sqrt{Npq}$ or $0.6745(10.6) = 7.1$. Hence, if
many random samples of 1,000 observations were taken from the
population of Table 95A, we should expect theoretically the frequency
of Class 3 of the sample to fall within $130 \pm 7$ about half the time.
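
The arithmetic for Class 3 can be checked with a short Python sketch. The function name is a hypothetical choice made for illustration; the formulas are the $\sqrt{Npq}$ and $0.6745\sqrt{Npq}$ of the text.

```python
import math

def class_frequency_spread(n_sample, p):
    """Expected frequency, standard deviation, and probable error
    of a class frequency when each observation falls in the class
    with probability p."""
    q = 1.0 - p
    expected = n_sample * p
    sd = math.sqrt(n_sample * p * q)
    probable_error = 0.6745 * sd
    return expected, sd, probable_error

# Class 3 of Table 95A: p = 13/100, samples of 1,000.
expected, sd, pe = class_frequency_spread(1000, 0.13)
print(f"expected = {expected:.0f}")
print(f"standard deviation = {sd:.3f}")
print(f"probable error = {pe:.3f}")
```

Carried without intermediate rounding, the probable error is about 7.17; the value 7.1 in the text comes from multiplying 0.6745 by the rounded standard deviation 10.6.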

If plus and minus three times the standard deviation of the expected
frequency be taken as the variation in the frequency that
may be allowed due to sampling, then if many samples of 1,000
observations are actually taken from the population of Table 95A,
we might obtain frequency distributions with the variation in the
class frequencies as indicated in Table 96C. So that if we were
sampling from Table 95A and should secure a sample with the
frequencies given by Table 96D, we would be inclined to suspect
that randomness went awry since the frequencies in Classes 3 and 7
are outside the limits set by Table 96C.

Table 96C                         Table 96D

Class      f(x)                   Class      f(x)

  1        30 ± 3(5.4)              1          25
  2        60 ± 3(7.5)              2          75
  3       130 ± 3(10.6)             3         175
  4       180 ± 3(12.1)             4         200
  5       200 ± 3(12.6)             5         210
  6       190 ± 3(12.4)             6         170
  7       120 ± 3(10.3)             7          80
  8        70 ± 3(8.1)              8          50
  9        20 ± 3(4.4)              9          15

Total    1,000                    Total     1,000
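
The comparison of Table 96D with the limits of Table 96C can be reproduced mechanically. The following Python sketch is an illustration only; the function name and the choice of a multiplier argument `k` are assumptions, while the frequencies are those of Tables 95A and 96D.

```python
import math

parent = [3000, 6000, 13000, 18000, 20000, 19000, 12000, 7000, 2000]  # Table 95A
observed = [25, 75, 175, 200, 210, 170, 80, 50, 15]                    # Table 96D

def classes_outside_limits(parent_freqs, sample_freqs, n_sample, k=3.0):
    """Return the class numbers whose observed frequency lies outside
    Np +/- k * sqrt(Npq)."""
    total = sum(parent_freqs)
    flagged = []
    for number, (big_f, f) in enumerate(zip(parent_freqs, sample_freqs), start=1):
        p = big_f / total                             # relative frequency in the parent
        sd = math.sqrt(n_sample * p * (1.0 - p))      # standard deviation of the class frequency
        if abs(f - n_sample * p) > k * sd:
            flagged.append(number)
    return flagged

print(classes_outside_limits(parent, observed, 1000))
```

With k = 3 the sketch flags Classes 3 and 7, in agreement with the text.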

In general, if the frequency of the $k$th class of the parent distribution
of population $S$ be $F_k(x)$, then the probability of an observation’s
falling in that class is $p_k \left(= \dfrac{F_k(x)}{S}\right)$ and the probability of the
observation’s not falling in that class is $q_k\ (= 1 - p_k)$. So, when a
sample of $N$ is chosen the expected frequency of the $k$th class of the
sample distribution is $Np_k$ with the standard deviation $\sqrt{Np_kq_k}$.

In applications we do not know the parent population and hence
the true value of $p_k$ is unknown. Let $f_k(x)$ be the observed frequency
of the $k$th class of the sample. If $N$ is fairly large we accept $f_k(x)/N$
as an approximation to $p_k$. Then we have

$$\sigma_{f_k(x)} = \sqrt{f_k(x)\left(1 - \frac{f_k(x)}{N}\right)}$$

Hence the frequency¹ of the $k$th class may be written with its
probable error as

$$f_k(x) \pm 0.6745\sqrt{f_k(x)\left(1 - \frac{f_k(x)}{N}\right)}$$

This means that if a sample of $N$ is taken from some unknown
parent distribution, the chances are even that the observed frequency
of the $k$th class, $f_k(x)$, will not differ from the expected frequency
of the $k$th class by more than $\pm 0.6745\sqrt{f_k(x)\left(1 - \dfrac{f_k(x)}{N}\right)}$.

If each class frequency of a distribution of N variates is divided 
by N, we obtain a distribution of relative frequencies or percentages . 
As a corollary to the theorem for finding $\sigma_{f_k(x)}$ we can immediately
derive a formula for finding the variation in the relative frequency
of the $k$th class, $\sigma_{p_k(x)}$, where $p_k(x) = f_k(x)/N$.

From Exercise 21 on page 148 we have $\sigma_{AX} = A\sigma_X$. Employing
this theorem we have

1 See Rietz, H. L., Mathematical Statistics, pp. 119-122, for a formula which 
gives a closer approximation. 



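$$\sigma_{p_k(x)} = \frac{1}{N}\,\sigma_{f_k(x)} = \frac{1}{N}\sqrt{N\,p_k(x)\,q_k(x)} = \sqrt{\frac{p_k(x)\,q_k(x)}{N}}$$

where $q_k(x) = 1 - p_k(x)$.

This formula, when used in its broad meaning to measure the
variation in a percentage, is usually written

$$\sigma_p = \sqrt{\frac{pq}{N}}$$

where $q = 1 - p$.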

Example. Suppose that of a large number of men examined for military 
service about 70 per cent have been accepted. If the same standards are 
imposed in future examinations, what are the limits of percentage accept- 
ances expected from a sample of 1,000? 

Solution. We have $p = 0.70$, $q = 0.30$, $N = 1{,}000$.

$$\sigma_p = \sqrt{\frac{(0.70)(0.30)}{1{,}000}} = 0.014 = 1.4 \text{ per cent}$$

Adopting $\pm 3\sigma_p$ as the limits of the percentage accepted, we should
expect the percentage accepted to vary from 70 − 4.2 per cent to 70 + 4.2
per cent. That is, we should expect from 65.8 to 74.2 per cent of the men
examined to be accepted.
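
A minimal Python sketch of the same computation follows; the function name is hypothetical and chosen only for this illustration.

```python
import math

def sd_of_proportion(p, n):
    """Standard deviation of a relative frequency: sqrt(pq / N)."""
    return math.sqrt(p * (1.0 - p) / n)

p, n = 0.70, 1000
sigma_p = sd_of_proportion(p, n)

# Limits p +/- 3 sigma_p, expressed in per cent.
low = 100 * (p - 3 * sigma_p)
high = 100 * (p + 3 * sigma_p)
print(f"sigma_p = {sigma_p:.4f}")
print(f"limits: {low:.1f} to {high:.1f} per cent")
```

The unrounded $\sigma_p$ is about 0.0145, which gives limits slightly wider than those in the text; the 65.8 and 74.2 per cent of the solution follow from rounding $\sigma_p$ to 1.4 per cent before tripling it.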



110. AN EXPERIMENT IN SAMPLING 


In order to clarify the problem of the sampling process, let us
consider the parent universe of 64 variates distributed according to
the point binomial $64\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^6$. Table 97 exhibits the
parent distribution in tabular form. We indicate the
mean and the standard deviation of the universe by
$M_u$ and $\sigma_u$ respectively.

For this universe we have:

$$M_u = np = 6\left(\tfrac{1}{2}\right) = 3$$

$$\sigma_u = \sqrt{npq} = \sqrt{6\left(\tfrac{1}{2}\right)\left(\tfrac{1}{2}\right)} = 1.225$$



In order that we may draw random samples from the given parent
population we prepare 64 cards in the following manner. On 1 card
we write $X = 0$ and $X^2 = 0$; on 6 cards we write $X = 1$ and $X^2 = 1$;
on 15 cards we write $X = 2$ and $X^2 = 4$; on 20 cards we write $X = 3$
and $X^2 = 9$; and so on for the entire parent distribution. We now
have a parent population of 64 members, one card for each individual,
from which we may draw random samples. Suppose we draw samples
of 10 cards. The remaining 54 cards constitute a sample of 54 cards.
With each drawing we therefore obtain samples of $N = 10$ and
$N = 54$. The sum of $X$ for all 64 cards is 192 and the sum of $X^2$
for all 64 cards is 672. We shuffle the cards well and take a sample
of 10 cards. We total the values of $X$ and of $X^2$ on the 10 cards and
find for the first sample of 10 that $\sum X = 26$ and $\sum X^2 = 100$. For
the first sample of 10 we now find $M = 2.6$ and $\sigma = 1.8$. We thus
have one sample mean and one sample standard deviation for
$N = 10$. For $N = 54$ we also have $\sum X = 192 - 26 = 166$ and
$\sum X^2 = 672 - 100 = 572$, from which we compute $M = 3.1$ and
$\sigma = 0.99$. We thus have for $N = 54$ one sample mean and one
sample standard deviation. We place the cards again on the pack,
shuffle them well again, and draw 10 cards, from which we again
compute the sample means and the sample standard deviations.
We can continue this process and select as many samples as we
please. Obviously ${}_{64}C_{10}$ distinct samples can be secured. We show
below the distributions of 100 actual sample means for the case in
which $N = 10$ and the case in which $N = 54$. We denote by $Z$ any
sample mean and its frequency by $f(z)$. (See Table 98.)
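
The card experiment is easy to imitate on a computer. The sketch below is an illustration under stated assumptions: the pack is built from the binomial coefficients 1, 6, 15, 20, 15, 6, 1 implied by $64(\tfrac{1}{2} + \tfrac{1}{2})^6$, the function names are hypothetical, and the random seed is arbitrary, so the particular samples drawn are not those of the text.

```python
import math
import random

# One card per variate: X = 0..6 with frequencies 1, 6, 15, 20, 15, 6, 1.
population = [x for x in range(7) for _ in range(math.comb(6, x))]

def mean_and_sd(values):
    """Mean and standard deviation (divisor N) from the sums of X and X^2."""
    n = len(values)
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean * mean
    return mean, math.sqrt(max(variance, 0.0))

def draw_split(cards, n_draw, rng):
    """Shuffle the pack; return the n_draw cards drawn and the cards left."""
    shuffled = cards[:]
    rng.shuffle(shuffled)
    return shuffled[:n_draw], shuffled[n_draw:]

rng = random.Random(0)  # arbitrary seed, for repeatability only
sample_10, sample_54 = draw_split(population, 10, rng)

print("universe:", mean_and_sd(population))
print("N = 10:  ", mean_and_sd(sample_10))
print("N = 54:  ", mean_and_sd(sample_54))
```

For the first drawing described in the text, $\sum X = 166$ and $\sum X^2 = 572$ give $\sigma$ of about 1.07 for the 54 cards when the mean 166/54 is carried unrounded; the value 0.99 results when the mean is first rounded to 3.1.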

Distribution (a), which has 100 sample means, was derived by 
drawing samples of 10 variates from the previously described parent 
population of 64 variates and computing the means of the samples 
drawn. Distribution (b), with samples of 54 variates, was similarly 
derived. Each distribution is therefore a distribution of sample 
means that has its mean (the mean of the means), its standard devia- 
tion (the standard deviation of the means), its skewness (the skewness 
of the means), and so on. It is the standard deviation of the means 
in which we are especially interested, for it gives a measure of the 
variability of the distribution of means. 

We shall leave it as an exercise for the student to verify the following
values:

              N = 10        N = 54

$M_z$         2.997         3.00
$\sigma_z$    0.298         0.078



Table 98

(a) N = 10                (b) N = 54

  z       f(z)              z*      f(z)

 2.2        1              3.15       1
 2.3        1              3.13       1
 2.4        2              3.11       2
 2.5        3              3.09       3
 2.6        5              3.07       5
 2.7        6              3.06       6
 2.8        9              3.04       9
 2.9       14              3.02      14
 3.0       20              3.00      20
 3.1       13              2.98      13
 3.2        8              2.96       8
 3.3        7              2.94       7
 3.4        4              2.93       4
 3.5        3              2.91       3
 3.6        1              2.89       1
 3.7        2              2.87       2
 3.8        1              2.85       1

Total     100              Total    100

* These are rounded values.

Figure 59 represents the curve for the parent distribution and Fig- 
ure 60 the ordinates of the distribution of sample means for N = 10. 


Figure 59. Curve of the parent distribution, with $M_u$ marked.

Figure 60. Ordinates of the distribution of sample means for N = 10.



It will be observed that the sample means are approximately 
normally distributed above and below $M_z = 2.997$, but with a dispersion
much less than that of the parent population. In the next
section we shall derive some theorems that should explain these 
phenomena. 
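
The mean and standard deviation of distribution (a) of Table 98 can be verified with a short Python sketch. The function name is hypothetical; the data are the 100 sample means for N = 10 as tabulated.

```python
import math

# Table 98(a): sample means z = 2.2, 2.3, ..., 3.8 and their frequencies.
z_values = [round(2.2 + 0.1 * i, 1) for i in range(17)]
freqs = [1, 1, 2, 3, 5, 6, 9, 14, 20, 13, 8, 7, 4, 3, 1, 2, 1]

def grouped_mean_sd(values, frequencies):
    """Mean and standard deviation (divisor N) of a frequency distribution."""
    n = sum(frequencies)
    mean = sum(f * v for v, f in zip(values, frequencies)) / n
    variance = sum(f * v * v for v, f in zip(values, frequencies)) / n - mean ** 2
    return mean, math.sqrt(variance)

m_z, sd_z = grouped_mean_sd(z_values, freqs)
print(f"M_z = {m_z:.3f}, sigma_z = {sd_z:.3f}")
print(f"ratio to the universe standard deviation 1.225: {sd_z / 1.225:.2f}")
```

The ratio printed in the last line shows how much smaller the dispersion of the means is than that of the parent population.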

The following exercises are given primarily to prepare the student 
for a facile reading of the succeeding section. The various numbers 
should therefore be solved in detail. 


EXERCISES 

1. Consider the parent population of 5 variates: $X_1, X_2, X_3, X_4, X_5$.
Write down the 10 distinct samples of 3 variates that may be drawn.
For example, $X_1, X_2, X_3$; $X_1, X_2, X_4$.

2. Let $Z_i$ represent the $i$th sample mean and write down the 10 distinct
sample means for the parent population in Exercise 1. For example,

$$Z_1 = \frac{X_1 + X_2 + X_3}{3}$$

3. Show that for the sample means in Exercise 2:

$$M_z = \frac{\sum Z_i}{10} = \frac{\sum X_i}{5} = M_x$$

State in words the theorem of this formula.

4. Show that:

$$\left(\sum X_i\right)^2 = \sum X_i^2 + 2\sum X_iX_j$$

where

$$\sum X_i = X_1 + X_2 + X_3 + X_4 + X_5$$

5. For the values of $Z_i$ found in Exercise 2 show that:

$$\sum Z_i^2 = \frac{2}{3}\left[\sum X_i^2 + \sum X_iX_j\right]$$

6. Using the relationship in Exercise 4 show that:

$$\sum Z_i^2 = \frac{1}{3}\left[\sum X_i^2 + \left(\sum X_i\right)^2\right]$$

7. From equation (7) of Chapter IV we have $\sigma_z^2 = \dfrac{\sum Z_i^2}{10} - M_z^2$.
Use this relationship and those established in Exercises 3 and 6 above to
show that, for the distribution of means here considered:



111. THE DISTRIBUTION OF MEANS 

Let us now consider the general problem of characterizing the
distribution of sample means derived by drawing samples of $N$ variates
from a parent population of $S$ variates. Obviously ${}_SC_N$ distinct
samples may be drawn. Each sample has its mean and the ${}_SC_N$
samples give us a distribution of sample means.

We shall undertake to characterize this distribution of means as 
we should characterize any distribution, that is, by finding its mean, 
its standard deviation, its skewness, and so on. 

A. The Mean of the Means. Let the parent universe be denoted
by $X_1, X_2, X_3, \ldots, X_S$. Denoting any sample mean by $Z_i$, we
have:

$$\begin{aligned}
Z_1 &= \tfrac{1}{N}\,[X_1 + X_2 + \cdots + X_{N-1} + X_N] \\
Z_2 &= \tfrac{1}{N}\,[X_1 + X_2 + \cdots + X_{N-1} + X_{N+1}] \\
&\;\;\vdots \\
Z_{{}_SC_N} &= \tfrac{1}{N}\,[X_{S-N+1} + X_{S-N+2} + \cdots + X_{S-1} + X_S]
\end{aligned} \tag{1}$$

We desire to find the mean of this distribution of sample means.
We must find $\sum Z \div {}_SC_N$. Note that each parenthesis contains $N$
terms and that the ${}_SC_N$ lines contain $(N \cdot {}_SC_N)$ terms which are not
all distinct. One $X$ occurs as frequently as another. Hence each of
the $S$ $X$’s occurs $(N \cdot {}_SC_N \div S)$ times. That is:

$$M_z = \frac{1}{{}_SC_N}\sum Z_i = \frac{1}{N}\left[\frac{N \cdot {}_SC_N}{S}\sum X_i\right] \div {}_SC_N = \frac{\sum X_i}{S} = M_x \tag{2}$$

We may express this important result as follows:

Theorem: The mean of the ${}_SC_N$ sample means formed by selecting
samples of N variates from a parent population of S variates is equal to
the mean of the S variates.

Stated less accurately, we may say that the mean of the distribution
of means is equal to the mean of the parent universe: $M_M = M_u$.
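
The theorem can be checked by brute force on a small universe. The following Python sketch enumerates every distinct sample and averages the sample means; the five population values are hypothetical, chosen only for illustration, and exact fractions are used so that the equality is not blurred by rounding.

```python
from fractions import Fraction
from itertools import combinations

def mean_of_sample_means(population, n):
    """Enumerate all distinct samples of n variates and return
    (mean of the sample means, number of samples)."""
    means = [Fraction(sum(sample), n) for sample in combinations(population, n)]
    return sum(means) / len(means), len(means)

population = [2, 3, 5, 8, 13]          # hypothetical universe, S = 5
m_z, count = mean_of_sample_means(population, 3)
m_x = Fraction(sum(population), len(population))

print(f"{count} samples, M_z = {m_z}, M_x = {m_x}")
```

Any other choice of values for `population`, or any sample size from 1 to S, should give the same agreement between the two means.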

B. The Standard Deviation of the Means. We shall now proceed
to find the standard deviation of the ${}_SC_N$ sample means with which
we can measure the variability of the distribution of means. We
should recall that the standard deviation of the parent universe is
given by

$$\sigma_x = \sqrt{\frac{\sum X_i^2}{S} - M_x^2}$$

and that the standard deviation of the distribution of sample means
is given by:

$$\sigma_z = \sqrt{\frac{\sum Z_i^2}{{}_SC_N} - M_z^2} \tag{3}$$

Since $M_z$ is known, $\sigma_z$ can be determined if we can find $\sum Z^2$.


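From equations (1) we have:

$$\begin{aligned}
Z_1^2 &= \tfrac{1}{N^2}\,[X_1^2 + X_2^2 + \cdots + X_N^2] + \tfrac{2}{N^2}\,[X_1X_2 + X_1X_3 + \cdots + X_{N-1}X_N] \\
Z_2^2 &= \tfrac{1}{N^2}\,[X_1^2 + X_2^2 + \cdots + X_{N-1}^2 + X_{N+1}^2] + \tfrac{2}{N^2}\,[X_1X_2 + X_1X_3 + \cdots + X_{N-1}X_{N+1}] \\
&\;\;\vdots \\
Z_{{}_SC_N}^2 &= \tfrac{1}{N^2}\,[X_{S-N+1}^2 + \cdots + X_S^2] + \tfrac{2}{N^2}\,[X_{S-N+1}X_{S-N+2} + \cdots + X_{S-1}X_S]
\end{aligned}$$

To find the sum of these ${}_SC_N$ squared means we note that the
sum of the parentheses containing terms of the type $X_i^2$ may be found
as follows: Each parenthesis contains $N$ terms and the ${}_SC_N$
parentheses contain $(N \cdot {}_SC_N)$ terms which are not all distinct. One
$X^2$ occurs as frequently as another. Hence each of the given $S$ $X^2$’s
occurs $(N \cdot {}_SC_N \div S)$ times.

To sum the parentheses containing the cross-product terms of the
type $X_iX_j$, note that each parenthesis contains ${}_NC_2$ terms and the
${}_SC_N$ parentheses contain $({}_NC_2 \cdot {}_SC_N)$ terms which are not all
distinct. One cross-product term occurs as frequently as another.
Since we can get ${}_SC_2$ cross-product terms from the given $S$ letters,
each of the ${}_SC_2$ distinct cross-product terms must occur
$({}_NC_2 \cdot {}_SC_N \div {}_SC_2)$ times. Therefore:

$$\sum Z_i^2 = \frac{1}{N^2}\left[\frac{N \cdot {}_SC_N}{S}\sum X_i^2 + \frac{{}_NC_2 \cdot {}_SC_N}{{}_SC_2}\, 2\sum X_iX_j\right]$$

and

$$\frac{\sum Z_i^2}{{}_SC_N} = \frac{1}{N^2}\left[\frac{N}{S}\sum X_i^2 + \frac{N(N-1)}{S(S-1)}\, 2\sum X_iX_j\right] = \frac{1}{NS}\left[\sum X_i^2 + \frac{N-1}{S-1}\, 2\sum X_iX_j\right]$$

Since

$$\left(\sum X_i\right)^2 = \sum X_i^2 + 2\sum X_iX_j$$

we have:

$$2\sum X_iX_j = \left(\sum X_i\right)^2 - \sum X_i^2$$

Hence:

$$\frac{\sum Z_i^2}{{}_SC_N} = \frac{S - N}{N(S - 1)} \cdot \frac{\sum X_i^2}{S} + \frac{S(N - 1)}{N(S - 1)}\left(\frac{\sum X_i}{S}\right)^2$$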



Substituting this in (3), recalling from (2) that $M_z = M_x$, we have
upon simplifying:

$$\sigma_z = \sigma_x\sqrt{\frac{S - N}{N(S - 1)}} \tag{4}$$
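
Formula (4) can be confirmed numerically by enumerating every sample from a small universe and comparing the directly computed standard deviation of the means with the right-hand side of (4). The sketch below is an illustration only: the function names and the six population values are hypothetical.

```python
import math
from itertools import combinations

def sd(values):
    """Standard deviation with divisor N, as used throughout the text."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / n)

def sd_of_means_formula(sigma_x, s, n):
    """Right-hand side of formula (4) for samples of n from a universe of s."""
    return sigma_x * math.sqrt((s - n) / (n * (s - 1)))

population = [2, 3, 5, 8, 13, 21]      # hypothetical universe, S = 6
n = 3
sample_means = [sum(c) / n for c in combinations(population, n)]

brute_force = sd(sample_means)
by_formula = sd_of_means_formula(sd(population), len(population), n)
print(f"enumeration: {brute_force:.6f}   formula (4): {by_formula:.6f}")

# The 64-card universe of Section 110, samples of 10.
print(f"cards, N = 10: {sd_of_means_formula(1.225, 64, 10):.3f}")
```

For the card universe, formula (4) gives about 0.359 for all ${}_{64}C_{10}$ possible samples of 10; the 0.298 reported in Section 110 is the standard deviation of only the 100 sample means actually drawn.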


Since, in general, $S$ is very large when compared with $N$, we can
obtain a simpler relationship if we assume that $S$ is infinite. For
this case we obtain¹

$$\sigma_z = \frac{\sigma_x}{\sqrt{N}} \tag{5}$$


in which, we repeat for emphasis, $N$ is the number of variates in the
sample and $\sigma_x$ is the standard deviation of the parent universe.
In fact, in (4) and (5) we may replace $\sigma_z$ and $\sigma_x$ by $\sigma_M$ and $\sigma_u$. As
the constants describing the parent universe are usually not known,
formulas (4) and (5) are apparently of value only theoretically.
Since we have stated that it is our problem to make certain inferences
about the parent universe from a consideration of the sample we shall
see in a later section how (5) will assist us in doing it. Experiment
justifies our making the assumption that the standard deviation of the
parent universe is approximately equal to the standard deviation of the
sample, the goodness of the approximation increasing as N is increased.
This assumption makes possible our expressing the formula for the
standard deviation of the mean in a workable form. We have,
finally

$$\sigma_M = \frac{\sigma}{\sqrt{N}} \quad \text{(approximately)} \tag{6}$$

where $M$ is the mean of the sample, $\sigma$ is the standard deviation of the
sample, and $N$ is the number of variates in the sample. That is:

$$\text{the standard deviation of the arithmetic mean} = \frac{\text{the standard deviation of the sample}}{\sqrt{\text{the number in the sample}}}$$

1 This is easily seen if we divide numerator and denominator of the quantity
under the radical by S and note that N/S and 1/S approach zero as S becomes
infinite.

C. The Probable Error of the Mean. In Chapter 4 we insisted
that the standard deviation is an excellent measure of the variability
of a distribution. We also insist that the standard deviation of the
mean as computed from (6) is an excellent measure of the variability
of the distribution of means. Tradition, however, has been a potent
influence in commending the use of the probable error to measure
the variation in the several statistical constants. In the preceding
chapter, we defined the probable error of any measure by the equation:

$$E_x = 0.6745\,\sigma_x$$

Therefore, the probable error of the mean is defined by the relation:

$$E_M = 0.6745\,\sigma_M = 0.6745\,\frac{\sigma}{\sqrt{N}} \tag{7}$$

The quantities, $\sigma_M$ and $E_M$, are frequently used as measures of
the reliability of the arithmetic mean. Since the smaller the variation,
the greater the reliability, a small standard deviation of the mean
or a small probable error of the mean means “accurate shooting.”
It is therefore evident from (6) and (7) that the smaller the $\sigma_M$ or $E_M$,
the greater the reliability in $M$.

The language of variation used in the preceding paragraph is
inverse. We can make the variation direct if we adopt the measure,
$h$, as is done in the theory of errors, for the index of precision where $h$
is defined by the equation (see Section 102, p. 399):

$$h_x = \frac{1}{\sigma_x\sqrt{2}}$$

For the distribution of means we have

$$h_M = \frac{1}{\sigma_M\sqrt{2}} = \frac{\sqrt{N}}{\sigma\sqrt{2}} = \frac{1}{\sigma}\sqrt{\frac{N}{2}} \tag{8}$$

as the index of precision of the mean. It will be observed from (6),
(7), and (8) that the reliability of the mean or the precision of the
mean varies as the square root of the number in the sample. That
is, the greater the number in the sample, the greater the reliability
in the mean. For example, to double the reliability, we must
quadruple the frequency.

It is not customary, however, in elementary statistics, to use
$h_M$ as the measure of the reliability of the mean. Rather do the
workers in applied statistics prefer $\sigma_M$ or $E_M$. In fact, it is the custom
(see Section 37, p. 143) to write the probable error of the mean
immediately after the computed mean of the sample with a ± sign
between them, thus:

$$M_u = M \pm E_M \tag{9}$$

For example, suppose a sample distribution of the heights of 
1,000 men shows an arithmetic mean of 67.5 inches and a standard 
deviation of 2.5 inches. Then: 

$$E_M = 0.6745\,\frac{2.5}{\sqrt{1000}} = 0.053 \text{ inch}$$

and

$$M_u = 67.5 \pm 0.053 \text{ inches}$$

Since the distribution of sample means collected from a normal parent
population is itself normal, this means simply that if a large number
of the means of samples of the heights of 1,000 men were collected,
half of the sample means would be within 0.053 inch of the mean
of the universe $M_u$ $(= M_M)$. Since $M_M \pm 3\sigma_M$ or $M_M \pm 4.5E_M$
includes nearly all the sample means, it is practically certain that no
sample mean will differ from the mean of the universe $M_u$ by more
than ± 4.5(0.053) inches.

It should be emphasized that the expression $M_u = M \pm E_M$ is
not to be interpreted as stating that the true mean of the universe is
somewhere between $M - E_M$ and $M + E_M$; nor is it to be
interpreted as stating that the true mean probably differs from the
computed sample mean by the amount $E_M$. It means that, so far as
variation due to pure chance is concerned, the odds are even that a
sample mean $M$ will not differ from the mean of the universe $M_u$ by
more than $E_M$.

If we were to write the arithmetic mean of the universe $M_u$ in the
form

$$M_u = M \pm \sigma_M$$

this would signify that the odds are about 2 to 1 that a sample mean
$M$ will not differ from $M_u$ by more than $\sigma_M$. It does not mean that
the odds are 2 to 1 that $M_u$ is within the interval whose end values
are $M - \sigma_M$ and $M + \sigma_M$. The probability pertains to the limits
of the range embracing $M_u$. We do not state the probability of $M_u$
lying within these limits, for $M_u$ is fixed. Thus, for the heights of
the sample of 1,000 men noted above we have





$$\sigma_M = \frac{2.5}{\sqrt{1000}} = \frac{2.5}{31.623} = 0.08 \text{ inch}$$


And we say that the odds are 2 to 1 that the mean of the sample,
67.5 inches, is within 0.08 inch of the mean of the universe. The
odds are 95 to 5 (or 19 to 1) that the mean of the sample does not
differ from the mean of the universe by more than ± 1.96(0.08) inches,
and the odds are 99 to 1 that the sample mean does not differ from
the mean of the universe by more than ± 2.58(0.08) inches.
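
The odds quoted here come from areas under the normal curve. The following Python sketch, which is our own illustration and not part of the original text, computes the fraction of a normal distribution lying within ± t standard deviations by means of the error function; the helper names `normal_coverage` and `odds_in_favor` are hypothetical.

```python
import math


def normal_coverage(t: float) -> float:
    """Fraction of a normal distribution lying within +/- t standard deviations."""
    return math.erf(t / math.sqrt(2))


def odds_in_favor(p: float) -> float:
    """Odds (to 1) in favor of an event whose probability is p."""
    return p / (1.0 - p)


# Multiples of the standard error used in the text.
for t in (0.6745, 1.0, 1.96, 2.58, 3.0):
    p = normal_coverage(t)
    print(f"t = {t}: coverage = {p:.4f}, odds about {odds_in_favor(p):.1f} to 1")
```

The multiple 0.6745 gives even odds (the probable error), 1.00 gives odds of roughly 2 to 1, and 1.96 gives 19 to 1, in agreement with the statements above.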

It has thus become customary to write $M_u$ in two different forms:
$M_u = M \pm E_M$ and $M_u = M \pm \sigma_M$. In the first case we have
“$M$ with a probable error of $E_M$” and in the second case we have
“$M$ with a standard error of $\sigma_M$.” To avoid ambiguity the statistician
should state definitely what his symbols mean.


ILLUSTRATIVE EXAMPLES 

Example 1. A corporation which sells a large number of automobile 
tires gathered data on the mileage obtained from a given type of tire. 
A large group of 100,000 users were questioned, and the data analyzed. 
For this universe of $S = 100{,}000$ it was found that $M_u = 21{,}000$ miles
and $\sigma_u = 2{,}000$ miles.

At a later time in order to compare the quality of the product, the 
corporation secured data from 1,000 users of the same type of tire. For 
this sample of $N = 1{,}000$ it was found that $M = 20{,}960$ and $\sigma = 1{,}980$
miles.

Was the corporation correct in concluding that the quality of the tire 
was not impaired, or that the variation of M from M u was not significant? 

Solution. Translating formula (4) into better symbols, we have

$$\sigma_M = \sigma_u\sqrt{\frac{S - N}{N(S - 1)}} = 2{,}000\sqrt{\frac{100{,}000 - 1{,}000}{1{,}000\,(100{,}000 - 1)}} = 62.9 \text{ miles} = 63 \text{ miles (rounded).}$$

Thus for the distribution of means, which is normal, we have $M_M = M_u
= 21{,}000$ miles and $\sigma_M = 63$ miles.

If many such samples were taken we could expect 68.27 per cent, or about
two-thirds, of the means to fall within the interval $M_M \pm \sigma_M$. That is,
we should expect about two-thirds of the sample means to fall in the
interval 21,000 ± 63 miles, or between 20,937 and 21,063. Since 20,960 is
within this interval, we conclude that the quality of the tire is not impaired 
and that the difference is not statistically significant. 






We can look at the problem from another point of view. We express
the divergence $M - M_M$ in standard units. We find

$$t_M = \frac{M - M_M}{\sigma_M} = \frac{20{,}960 - 21{,}000}{63} = -\frac{40}{63} = -0.63$$

Looking up the probability table we find

$$P = 0.5000 - \left[A\right]_0^{0.63} = 0.5000 - 0.2357 = 0.2643$$

We would therefore expect 26 per cent of the sample means to be less 
than 20,960 miles and 74 per cent to be greater than 20,960 miles. In 
other words the probability of a sample mean being less in value than 
20,960 is 26/100 or 13/50 and greater than 20,960 is 74/100 or 37/50. 
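
The table look-up may be replaced by a direct computation. The sketch below is our own addition, with the hypothetical helper name `normal_cdf`; it takes the rounded value of 63 miles for the standard error of the mean, as in the solution, and evaluates the normal area with the error function instead of a printed table. Since the deviation is not rounded to two places, the result differs from the table value only in the fourth decimal.

```python
import math


def normal_cdf(t: float) -> float:
    """Area under the standard normal curve to the left of t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2)))


universe_mean = 21_000   # M_u in miles
sample_mean = 20_960     # M in miles
sigma_m = 63.0           # standard error of the mean, rounded as in the text

t = (sample_mean - universe_mean) / sigma_m
p_below = normal_cdf(t)          # chance that a sample mean falls below 20,960
p_above = 1.0 - p_below

print(f"t = {t:.2f}")
print(f"P(mean < 20,960) = {p_below:.2f}, P(mean > 20,960) = {p_above:.2f}")
```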

Of course we can base our argument on the probable error of the mean 
instead of the standard error of the mean. We find 

$$E_M = 0.6745\,\sigma_M = 0.6745(62.9) = 42 \text{ miles}$$

Then we can state that the chances are even that a sample mean will 
lie in the interval 21,000 ± 42 or between 20,958 and 21,042. The given 
mean 20,960 is within this interval, and such a small divergence as 40/42 
probable errors from M m is certainly within the tolerance limits of the 
most scrupulous. 

Example 2. Suppose in the previous example we use formula (5)

$$\sigma_M = \frac{\sigma_u}{\sqrt{N}}$$

Will our results be affected?





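Solution. This means, writing formula (4) in the form

$$\sigma_M = \frac{\sigma_u}{\sqrt{N}}\sqrt{\frac{1 - \dfrac{N}{S}}{1 - \dfrac{1}{S}}}$$

that $S$ is so large compared to $N$ that we may consider $\dfrac{N}{S}$ and $\dfrac{1}{S}$ as negligible.
We find for the data at hand

$$\sigma_M = \frac{2{,}000}{\sqrt{1{,}000}} = \frac{2{,}000}{31.62} = 63.2 = 63 \text{ miles (rounded)}$$

This approximation certainly will in no way alter our previous
conclusion.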


Example 3. If in Example 1 we use formula (6) for computing $\sigma_M$,
will our conclusion be altered?

Solution:

$$\sigma_M = \frac{\sigma}{\sqrt{N}} = \frac{1{,}980}{\sqrt{1{,}000}} = \frac{1{,}980}{31.62} = 62.6 = 63 \text{ miles (rounded)}$$


And this approximation will also in no way alter our conclusion. 
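
The three computations of Examples 1, 2, and 3 may be placed side by side. The Python sketch below is our own illustration; the function names `se_finite` and `se_infinite` are hypothetical labels for formula (4) and for formulas (5) and (6) respectively.

```python
import math


def se_finite(sigma_u: float, n: int, s: int) -> float:
    """Formula (4): standard error of the mean for a finite universe of S items."""
    return sigma_u * math.sqrt((s - n) / (n * (s - 1)))


def se_infinite(sigma: float, n: int) -> float:
    """Formulas (5) and (6): sigma / sqrt(N), with universe or sample sigma."""
    return sigma / math.sqrt(n)


results = {
    "(4) finite universe, sigma_u = 2,000": se_finite(2000, 1000, 100_000),
    "(5) infinite universe, sigma_u = 2,000": se_infinite(2000, 1000),
    "(6) sample sigma = 1,980": se_infinite(1980, 1000),
}

for label, value in results.items():
    print(f"{label}: {value:.1f} miles, rounded {round(value)}")
```

All three values round to 63 miles, which is why the conclusion of Example 1 is unaffected by the choice of formula.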


Example 4. The blood pressure of 10,000 young men of given age was 
measured and recorded. The analysis of the sample gave $M = 122$ and
$\sigma = 9$. Find the standard error and the probable error of the mean, and
interpret them. What is the 5 per cent level of significance?

Solution. In this case we do not know the statistics of the universe. 
Our information about the statistics of the universe must be inferred 
from the statistics of the sample. The mean of the sample is an estimate 
of the mean of the universe. How reliable is the estimate? 

We compute the dispersion of the sample means by (6). We have

$$\sigma_M = \frac{\sigma}{\sqrt{N}} = \frac{9}{\sqrt{10{,}000}} = 0.09$$

which indicates the dispersion of the sample means about the universe
mean $M_u$. The universe mean is unknown. However, we can state that
the odds are 2 to 1 that the sample mean 122 does not differ numerically
from $M_u$ by more than 0.09. Since 99.74 per cent of the sample means
vary from $M_u$ by not more than $\pm 3\sigma_M$ (= 0.27), we may conclude that
the odds are 99.74 to 0.26, or about 385 to 1, that the sample mean 122
does not differ numerically from $M_u$ by more than 0.27.


$$E_M = 0.6745\,\sigma_M = 0.6745(0.09) = 0.06$$


which indicates that the chances are even that the sample mean 122 does 
not differ numerically from M u by more than 0.06. 





The 5 per cent level of significance is at $\pm 1.96\sigma_M$ or at ± 1.96(0.09)
= ± 0.176. That is, the odds are 95 to 5, or 19 to 1, that the sample
mean 122 does not differ numerically from $M_u$ by more than 0.176. In
other words, if many samples of 10,000 were measured and their means
computed, we should expect 95 per cent of the means to lie within the
interval $M_u - 0.176$ and $M_u + 0.176$.


EXERCISES 

1. Professor Sorenson 1 records an experiment in which fifty samples 
of fifty men each were taken from American Men of Science and the mean 
age of each sample computed. The means ranged in value from 41.40 
years to 51.14 years. He found that this distribution of means was normal
with $M_M = 46.34$ years and $\sigma_M = 2.30$ years.

a. What is the estimated mean age of the 2,500 men? Dr. Sorenson
gives 46.34 years as the computed mean age of the 2,500 men.

b. How many of the means would you expect to find between $M_M - \sigma_M$
and $M_M + \sigma_M$? Dr. Sorenson found 34, or 68 per cent of them.

c. Compute $E_M$. What does it mean?

d. How many of the means would you expect to find between $M_M - E_M$
and $M_M + E_M$? Dr. Sorenson found 25, or 50 per cent of them.

e. Consider the 2,500 ages as a sample of all the men, 22,000, whose
names appeared in the book. For this large sample Dr. Sorenson gives
$M = 46.34$ and $\sigma = 12.46$. Compute $E_M$ and interpret it with regard
to the average age of all men in the book.

2. A sample of $N = 625$ gave $E_M = 0.27$. What size sample would be
required to give $E_M = 0.09$? 0.045?

3. A distribution of the weights at birth of a sample of 402 infants
gave $M = 7.29$ pounds and $\sigma = 1.006$ pounds. Compute $E_M$ and
interpret it.

4. A study of “red blood cell count” for 40 normal men gave $M = 4.973$
millions per cu. mm. and $\sigma = 0.332$ millions per cu. mm. Find $\sigma_M$ and
interpret it.

5. For a group of 1,000 college students the mean height was 68.2
inches and the standard deviation was 2.5 inches. (a) Find the probability
that in a sample of 100, the mean height will be between 67.82 and 68.78
inches. (b) Find the probability that in a sample of 100, the mean will
be greater than 68.9 inches.

6. Consider the table of the weights of men found on page 140. Does 
the difference between each sample mean and the universe mean lie within 
the 5 per cent level of significance? 

7. Consider the table of the heights of men found on page 141. Does
the difference between each sample mean and the universe mean lie
within the 5 per cent level of significance?

1 Herbert Sorenson: Statistics for Students of Psychology and Education.

8. Using the probable error notation, E. S. Pearson gave the mean 
length of cubit for 1,063 British males as 18.31 ± 0.019 inches. Show
that $\sigma = 0.92$ inch. Does this mean that, assuming a normal distribution,
about 709 of these men had cubit lengths between 17.39 and 19.23 inches? 

9. If the statement is made that the mean height of 1,000 men is
68.78 ± 0.046 inches, can you adduce evidence that 0.046 is $E_M$ and not $\sigma_M$?

10. (Freeman) Two engineers made 1,306 readings during a 5-year
period on the heat value in Btu. of a mixed gas. The distribution,
approximately normal, gave $M_u = 534.99$ Btu. and $\sigma_u = 3.85$ Btu. On
64 days at irregular intervals, state inspection was conducted and the 
mean of the approximately normal sample was 536.72 Btu. Would you 
say that the 64 measures constituted a random sample? 

11. The breaking strength of a certain type of cord has been established
from considerable experience to be 18.3 ounces with a standard deviation 
of 1.2 ounces. A sample of 100 pieces of the same type of cord shows a 
mean breaking strength of 16.5 ounces. Would you say that the sample 
is inferior? 

12. After observing a large number of cases it has been established that 
a certain disease is 10 per cent fatal. The hospital of the Good Shepherd 
found that during the period 1937-1942, of 100 patients admitted with the 
disease 12 died. May this difference be attributed to chance? Hint:
$\sigma_p = \sqrt{pq/N}$.

13. At Bucknell University the freshmen who take College Algebra 
are previously screened by a placement test. Our records covering a 
period of years reveal that about 16 per cent fail the course. During the 
fall semester, 1942, of 400 freshmen enrolled in College Algebra 20 per cent 
failed. Adopting the 5 per cent level as a basis for judgment, would 
you say this difference is significant? 

D. The Skewness and Excess of the Distribution of Means. We
have derived formulas for the arithmetic mean and the standard
deviation of the distribution of sample means given by (1). In order 
to characterize more completely the distribution, we should derive 
formulas for the skewness and the excess. In this section we shall 
give an abridged derivation for the skewness, leaving the details 
for the reader to work out, and shall give without proof the formulas 
for the excess. (See Exercise 9 at end of this chapter.) 

The skewness for the distribution of means is given by:

$$\alpha_{3,z} = \frac{\mu_{3,z}}{\sigma_z^3}$$

where

$$\mu_{3,z} = \frac{\Sigma Z^3}{{}_SC_N} - 3\,\frac{\Sigma Z^2}{{}_SC_N}\cdot M_z + 2M_z^3 \qquad (10)$$


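Returning to equations (1) we have:

$$\frac{\Sigma Z^3}{{}_SC_N} = \frac{1}{N^2S}\left[\Sigma X^3 + \frac{3(N-1)}{S-1}\,\Sigma\Sigma X_i^2X_j + \frac{6(N-1)(N-2)}{(S-1)(S-2)}\,\Sigma\Sigma\Sigma X_iX_jX_k\right] \qquad (11)$$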


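Since

$$(\Sigma X)^2 = \Sigma X^2 + 2\Sigma\Sigma X_iX_j$$

and

$$\Sigma X\,\Sigma X^2 = \Sigma X^3 + \Sigma\Sigma X_i^2X_j$$

and

$$2\,\Sigma X\,\Sigma\Sigma X_iX_j = 2\Sigma\Sigma X_i^2X_j + 6\Sigma\Sigma\Sigma X_iX_jX_k$$

we have:

$$\Sigma\Sigma X_i^2X_j = \Sigma X^2\,\Sigma X - \Sigma X^3$$

$$6\Sigma\Sigma\Sigma X_iX_jX_k = (\Sigma X)^3 - 3\Sigma X^2\,\Sigma X + 2\Sigma X^3$$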


Substituting these values in (11) we obtain:

$$\frac{\Sigma Z^3}{{}_SC_N} = \frac{1}{N^2}\left[\frac{(S-N)(S-2N)}{(S-1)(S-2)}\,\frac{\Sigma X^3}{S} + \frac{3S(S-N)(N-1)}{(S-1)(S-2)}\,\frac{\Sigma X^2}{S}\,M_x + \frac{S^2(N-1)(N-2)}{(S-1)(S-2)}\,M_x^3\right]$$


Substituting this and the other necessary values, previously found,
into (10) we have:

$$\mu_{3,z} = \frac{(S-N)(S-2N)}{N^2(S-1)(S-2)}\left[\frac{\Sigma X^3}{S} - 3\,\frac{\Sigma X^2}{S}\,M_x + 2M_x^3\right]$$

$$\mu_{3,z} = \frac{(S-N)(S-2N)}{N^2(S-1)(S-2)}\,\mu_{3,x}$$

and hence

$$\alpha_{3,z} = \frac{S-2N}{S-2}\sqrt{\frac{S-1}{N(S-N)}}\;\alpha_{3,x} \qquad (12)$$

If $S$ is infinite:

$$\alpha_{3,z} = \frac{1}{\sqrt{N}}\,\alpha_{3,x} \qquad (13)$$


Further, if the parent population is normal,¹ $\alpha_{3,x} = 0$; hence
$\alpha_{3,z} = 0$. Therefore the skewness of the distribution of sample means
chosen from a parent normal distribution is zero.

1 See p. 405. 





By a similar procedure, but with much more laborious algebra, it
may be shown that for the distribution of sample means given by
(1) the excess is given by the formula:

$$\alpha_{4,z} - 3 = \frac{(S-1)(S^2 + S - 6NS + 6N^2)}{N(S-N)(S-2)(S-3)}\left[\alpha_{4,x} - 3\right] - \frac{6S(N-1)(S-N-1)}{N(S-N)(S-2)(S-3)}$$

If $S$ is infinite:

$$\alpha_{4,z} - 3 = \frac{1}{N}\left[\alpha_{4,x} - 3\right]$$

If the parent population is normal, $\alpha_{4,x} = 3$, in which case
$\alpha_{4,z} = 3$. Therefore, we may say that the excess of the distribution
of sample means chosen from a normal parent population is zero.
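
For a small finite universe the skewness and excess formulas can be checked by brute force, drawing every possible sample. The Python sketch below is our own illustration and not part of the original derivation. The seven-item universe is a made-up example, the function names are hypothetical, and the finite-universe expressions coded in `predicted_alpha3` and `predicted_excess` are formula (12) and the excess formula as we read them from the text.

```python
from itertools import combinations
import math


def central_moment(values, k):
    """k-th moment of the values about their own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** k for v in values) / len(values)


def alpha(values, k):
    """Standardized k-th moment, mu_k / sigma^k."""
    return central_moment(values, k) / central_moment(values, 2) ** (k / 2)


def predicted_alpha3(universe, n):
    """Formula (12): skewness of the means of samples of n from the universe."""
    s = len(universe)
    factor = (s - 2 * n) / (s - 2) * math.sqrt((s - 1) / (n * (s - n)))
    return factor * alpha(universe, 3)


def predicted_excess(universe, n):
    """Finite-universe excess formula for the distribution of sample means."""
    s = len(universe)
    denom = n * (s - n) * (s - 2) * (s - 3)
    a = (s - 1) * (s * s + s - 6 * n * s + 6 * n * n) / denom
    b = 6 * s * (n - 1) * (s - n - 1) / denom
    return a * (alpha(universe, 4) - 3) - b


universe = [1, 2, 4, 8, 16, 3, 7]  # hypothetical universe, S = 7
n = 3                              # size of each sample

# Means of all S-choose-N possible samples drawn without replacement.
means = [sum(c) / n for c in combinations(universe, n)]

print(alpha(means, 3), predicted_alpha3(universe, n))
print(alpha(means, 4) - 3, predicted_excess(universe, n))
```

Each printed pair should agree to within rounding error, since the enumeration covers the whole distribution of means.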

In the text we have stated that the distribution of means is a 
normal distribution. It has long been known, probably since the 
time of Gauss, that if random samples are taken from a universe 
distributed normally , the means of the samples also form a normal 
distribution. If the universe is non-normal, not a great deal is 
known at present, from analytic considerations, about the distribu- 
tions of statistics of samples. However, even for small values of N , 
there is sufficient experimental evidence to support the conclusion 
that the distribution of means of samples selected randomly from any 
finite universe is practically normal. 

We have shown that if $S$ is unlimited and $N$ is large, $\alpha_{2,M} = 1$,
$\alpha_{3,M} = 0$, $\alpha_{4,M} = 3$. By a continuation of this same method,¹ under
the stated hypotheses, it is easy to show that $\alpha_{5,M} = 0$, $\alpha_{6,M}
= 1 \cdot 3 \cdot 5 = 15$, $\alpha_{7,M} = 0$, $\alpha_{8,M} = 1 \cdot 3 \cdot 5 \cdot 7 = 105$, and so on.

That is to say, if fairly large samples are taken from an infinite 
universe, the moments of the distribution of means are those of a 
normal curve. Further, it is not difficult to show that if the parent 
universe is infinite and distributed according to the Pearson Type III 
curve, the moments of the distribution of sample means are also of 
the Pearson Type III curve. However, it is well known that as N 
increases the Type III curve approaches the normal, so again we have 
the property that as N grows large, the curve of means approaches 
normality. 

1 Richardson, C. H., The Statistics of Sampling , published by Edwards Brothers, 
Ann Arbor, Michigan. 





We may say then that, so far as the practical needs are concerned,
the distribution of means has been rather thoroughly explored. We
regret that we cannot say so much for the distribution of sample
standard deviations. If the universe is normal the curve of the
standard deviations of samples is Type III. If the universe is
non-normal, we do not know the distribution function of $\sigma$, not even the
values of the moments. However, by working through the moments of
the variance (= $\sigma^2$) we arrive at the facts contained in the next section.

112. THE RELIABILITY OF THE STANDARD DEVIATION 

In Section 110 (p. 425) we outlined an experiment that was intended 
to explain to the reader what is meant by a sample mean and a 
sample standard deviation. Each sample drawn has its mean, its 
standard deviation, et cetera. In order to introduce the reader to the 
problem of sampling, we have shown in considerable detail in the 
preceding section how we may characterize the distribution of sample 
means. We were especially interested, however, in finding measures 
of the reliability of the mean, which measures we found in $\sigma_M$ and
$E_M$ (= 0.6745 $\sigma_M$).

The sample standard deviations in like manner form a distribution
that may be characterized by its mean, $M_\sigma$ (the mean of the standard
deviations), its standard deviation, $\sigma_\sigma$ (the standard deviation of the
standard deviations), and so on. We are especially interested in
$\sigma_\sigma$ or $E_\sigma$, by which we measure the variability and the reliability of
any sample standard deviation.

The algebraic development showing the derivation of $M_\sigma$ and $\sigma_\sigma$
would take us too far afield. It can be shown that if the parent
population is normal and $N$ is large, the mean of the distribution of
standard deviations is approximately equal to the standard deviation
of the parent population, and the standard deviation of the distribution
of standard deviations is approximately equal to the standard deviation of
the parent population divided by the square root of twice the number
of variates in the sample.¹ That is:

$$M_\sigma = \sigma_u$$

$$\sigma_\sigma = \frac{\sigma_u}{\sqrt{2N}}$$

1 See formula (24) of Section 114. 





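Since the standard deviation of the parent population, $\sigma_u$, is
approximately equal to the standard deviation of the sample, $\sigma$, we
have:

$$M_\sigma = \sigma \text{ approximately}$$

and

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2N}} \text{ approximately} \qquad (14)$$

$$E_\sigma = 0.6745\,\sigma_\sigma = 0.6745\,\frac{\sigma}{\sqrt{2N}} \qquad (15)$$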


The meaning of $E_\sigma$ is similar to that given for $E_M$. Thus, it is
customary to write the standard deviation in the form

$$\sigma_u = \sigma \pm E_\sigma$$

which means that, assuming that the curve of sample standard
deviations is approximately normal, half the sample standard
deviations lie within the range whose end values are $\sigma_u - E_\sigma$ and
$\sigma_u + E_\sigma$. It also means that the chances are even that the sample
$\sigma$ does not differ from $\sigma_u$ by more than $\pm E_\sigma$, and it is practically
certain that the sample $\sigma$ does not differ from $\sigma_u$ by more than
$\pm 4.5(E_\sigma)$.
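
Formulas (14) and (15) are easily applied by machine. The Python sketch below is our own illustration with hypothetical function names; it uses the sample of the heights of 1,000 men from the preceding section (standard deviation 2.5 inches) merely as convenient data.

```python
import math


def se_of_std(sigma: float, n: int) -> float:
    """Formula (14): standard error of a sample standard deviation."""
    return sigma / math.sqrt(2 * n)


def probable_error_of_std(sigma: float, n: int) -> float:
    """Formula (15): probable error of a sample standard deviation."""
    return 0.6745 * se_of_std(sigma, n)


sigma, n = 2.5, 1000  # heights of 1,000 men, sigma in inches

print(f"standard error of sigma = {se_of_std(sigma, n):.3f} inch")
print(f"probable error of sigma = {probable_error_of_std(sigma, n):.3f} inch")
```

Note that the standard error of the standard deviation is smaller than that of the mean by the factor $1/\sqrt{2}$, since the divisor is $\sqrt{2N}$ in place of $\sqrt{N}$.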


113. THE RELIABILITY OF THE DIFFERENCE 
BETWEEN TWO MEASURES 

An important problem in applied statistics is the determination 
of some criterion that will assist one in judging whether an observed 
difference between two samples is apparent or real. That is, is the 
difference between two samples such that it might arise from sampling 
(that is, from pure chance), or is the difference significant of a greater 
variation in the two samples than can be explained by random 
sampling alone? 

Suppose we select from a normal parent population two samples, 
each fairly large. Each sample has its mean, its standard deviation, 
et cetera. The two means will not likely be equal and hence we shall 
have a difference of two means. Also, the standard deviations will 
not likely be equal and hence we shall have a difference of two stand- 
ard deviations. Continue this process until we have, say, m pairs of 
samples, m usually a large number, and hence m differences in means 





that will constitute a distribution of differences in sample means. 
From these m pairs of samples we may also have m differences in
standard deviations that will constitute a distribution of differences 
of sample standard deviations. 

Let $X_i$ and $Y_i$ be used to distinguish corresponding characteristics
(means, standard deviations, et cetera) of two groups when the
$i$th pair of samples has been taken, and $X_i - Y_i$ be the difference
in any pair of corresponding characteristics.


Table 99

Sample Pair     Group I, X     Group II, Y     Difference, X − Y
1               $X_1$          $Y_1$           $X_1 - Y_1$
2               $X_2$          $Y_2$           $X_2 - Y_2$
. . .           . . .          . . .           . . .
i               $X_i$          $Y_i$           $X_i - Y_i$
. . .           . . .          . . .           . . .
m               $X_m$          $Y_m$           $X_m - Y_m$


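We shall find the arithmetic mean and the standard deviation for
the distribution of differences:

$$M_{X-Y} = \frac{\Sigma (X_i - Y_i)}{m} = \frac{\Sigma X_i}{m} - \frac{\Sigma Y_i}{m} = M_X - M_Y \qquad (16)$$

$$\sigma_{X-Y}^2 = \frac{\Sigma (X - Y)^2}{m} - \left[\frac{\Sigma (X - Y)}{m}\right]^2 \quad \text{(by (7), chapter 4)}$$

$$= \frac{\Sigma X^2}{m} - \left(\frac{\Sigma X}{m}\right)^2 + \frac{\Sigma Y^2}{m} - \left(\frac{\Sigma Y}{m}\right)^2 - 2\left[\frac{\Sigma XY}{m} - \frac{\Sigma X}{m}\cdot\frac{\Sigma Y}{m}\right]$$

Using (7) on p. 128, and (7) on p. 245, we have:

$$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2r_{XY}\,\sigma_X\sigma_Y \qquad (17)$$

If the two distributions, $X$ and $Y$, are independent so that $r_{XY}$
is zero, then:

$$\sigma_{X-Y} = \sqrt{\sigma_X^2 + \sigma_Y^2} \qquad (18)$$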





We are especially interested in (18) when $X$ and $Y$ are the
corresponding means of two samples. Then

$$\sigma_{M_X - M_Y} = \sqrt{\sigma_{M_X}^2 + \sigma_{M_Y}^2} \qquad (19)$$

gives a measure of the variability (or the reliability) of the differences
in two sample means. Also

$$\sigma_{\sigma_X - \sigma_Y} = \sqrt{\sigma_{\sigma_X}^2 + \sigma_{\sigma_Y}^2} \qquad (20)$$

gives a measure of the variability and the reliability of the differences
in two sample standard deviations. The formulas for the corresponding
probable errors are found by multiplying (19) and (20) by
0.6745. Thus:

$$E_{M_X - M_Y} = 0.6745\sqrt{\sigma_{M_X}^2 + \sigma_{M_Y}^2} \qquad (21)$$

$$E_{\sigma_X - \sigma_Y} = 0.6745\sqrt{\sigma_{\sigma_X}^2 + \sigma_{\sigma_Y}^2} \qquad (22)$$


Let us consider, for illustration, the results on the placement 
examination in mathematics of two different freshman classes at 
Bucknell University. 


Group I               Group II
$N_X = 329$           $N_Y = 302$
$M_X = 32.75$         $M_Y = 30.60$
$\sigma_X = 8.05$     $\sigma_Y = 6.95$


The difference between the two means is 32.75 — 30.60 = 2.15. 
Is this difference so large that it could not be due to chance or does 
it indicate that Group I really demonstrated a significantly better 
training in elementary mathematics? 

There is no question about the observed difference in the two 
means. It is certainly 2.15. Could such a difference be due to 
chance? Yes, such a difference could be due to chance but we shall 
show that the likelihood that it did arise from chance is so small 
that we feel justified in neglecting it and in assuming that the dif- 
ference has been caused by other factors than pure chance. When 
such is the case, the statistician says “the difference is significant.” 





Using (6), (19), and (21):

$\sigma_{M_X} = 0.444$          $\sigma_{M_X - M_Y} = 0.597$

$\sigma_{M_Y} = 0.400$          $E_{M_X - M_Y} = 0.403$

We may now state that the chances are even that an observed
$(M_X - M_Y)$ is within ± 0.403 of the true (unknown) value, and it is
practically certain that an observed $(M_X - M_Y)$ is within ± 3(0.597)
or ± 4.5(0.403) of the true value. Following custom, we describe
this variation by writing 2.15 ± 0.403 which, translated into English,
reads “2.15 with a probable error 0.403.” It may be noticed
incidentally that the difference 2.15 is 3.6 times its standard error and
5.3 times its probable error. Such a large numerical difference as
this would rarely occur by pure chance, in fact, about 4 times in
10,000. When the happening of an event, such as this under
discussion, is extremely unlikely, we conclude that some factors other
than pure chance have influenced the result.
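
The whole computation for the two freshman classes may be sketched in Python as follows. The code is our own illustration; the function names are hypothetical, and the tail probability is computed from the complementary error function instead of being read from a printed table, so it may differ slightly from the round figure quoted above.

```python
import math


def se_mean(sigma: float, n: int) -> float:
    """Formula (6): standard error of a sample mean."""
    return sigma / math.sqrt(n)


def se_difference(se_x: float, se_y: float, r: float = 0.0) -> float:
    """Formula (17); reduces to (18) and (19) when r = 0 (independent samples)."""
    return math.sqrt(se_x ** 2 + se_y ** 2 - 2 * r * se_x * se_y)


def two_sided_tail(t: float) -> float:
    """Probability that a normal deviate exceeds |t| in either direction."""
    return math.erfc(abs(t) / math.sqrt(2))


se_x = se_mean(8.05, 329)   # Group I
se_y = se_mean(6.95, 302)   # Group II
sigma_d = se_difference(se_x, se_y)
e_d = 0.6745 * sigma_d      # formula (21)
d = 32.75 - 30.60           # observed difference in means
t = d / sigma_d

print(f"sigma_D = {sigma_d:.3f}, E_D = {e_d:.3f}, t = {t:.1f}")
print(f"two-sided tail probability = {two_sided_tail(t):.5f}")
```

The optional argument `r` allows for correlated pairs as in (17); for the independent samples of this section it is left at zero.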

While proofs for all the statements are beyond the scope of this
text, other pertinent facts are the following. If many independent
sample pairs are taken from normal parent populations, the differences
(indicated by $D$) of means, standard deviations, etc. also form
approximately normal distributions. As may be expected, the mean
of the distribution of differences, $M_D$, is zero and the standard
deviation, $\sigma_D$, is given by (18). The probable error of the $D$ distribution
is of course $E_D = 0.6745\sigma_D$. It is customary to take

$$\frac{D}{\sigma_D} = t \quad \text{or} \quad \frac{D}{E_D} = k$$

as the criteria whereby one can quickly determine if the difference
$D$ is significant. As a “rule of thumb” we say:

if $t > 3$ (or if $k > 4.5$), the difference is certainly significant;

if $t > 2$ (or if $k > 3$), the difference is possibly significant;

if $t < 2$ (or if $k < 3$), the difference is probably not significant.

These limits, however, are arbitrary, and consequently vary among 
the authorities. 

In the particular problem of this section:

$$\frac{D}{\sigma_D} = \frac{M_X - M_Y}{\sigma_{M_X - M_Y}} = \frac{2.15}{0.597} = 3.6$$


Hence, from a comparison of the means, we would conclude that 
Group I and Group II came from statistically different parent popula- 
tions; or, if from the same parent population, then other factors than 
pure chance must have caused a numerical difference as large as 2.15. 

The assumptions underlying this procedure deserve a brief con- 
sideration. The universe difference in the means, or other statistics, 
is assumed to be zero. Is this a reasonable assumption? We think it 
is. Let the reader return to Table 99, and remember that each 
sample, fairly large, is drawn from a normal parent universe. It 
would seem then that of the m differences of $(X_i - Y_i)$, negative
differences would occur about as frequently as positive differences
and of equal numerical amounts so that their sum $\Sigma(X_i - Y_i)$ would
theoretically equal zero. Hence, theoretically $M_X = M_Y$.

R. A. Fisher terms such an hypothesis a “null hypothesis,” the
hypothesis that there is no difference. So in our applications we
try to give the facts a chance to nullify the hypothesis. We make 
no effort to prove it or to disprove it; rather do we attempt to cast 
doubt upon it. 

In our illustrative example we sought evidence that the two samples 
came from different universes. Very well, on the basis of large 
sample theory, we began by assuming they came from the same 
universe with $M_D = 0$ and $\sigma_D = 0.597$. It is expected that practically
all of the actual differences will fall within $0 \pm 3\sigma_D$. If, therefore,
the actual difference $D$ exceeds $3\sigma_D$ numerically, then it is reasonable
to conclude that our assumption of the same universe is probably
wrong. Thus we conclude that the two samples came from different
universes.

Of course we may wish to see what light a comparison of the
variabilities of the samples will throw upon our problem. We find,
using (14) and (20),

$$\sigma_{\sigma_X} = \frac{8.05}{\sqrt{2(329)}} = 0.314 \qquad \sigma_{\sigma_Y} = \frac{6.95}{\sqrt{2(302)}} = 0.283$$

$$\sigma_{\sigma_X - \sigma_Y} = \sqrt{(0.314)^2 + (0.283)^2} = 0.423$$

$$\frac{D}{\sigma_D} = \frac{8.05 - 6.95}{0.423} = 2.6$$


So a comparison of the standard deviations supports the previous 
conclusion since t > 2. 
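
The comparison of the two standard deviations follows the same pattern as the comparison of the means, with formula (14) in place of formula (6). A brief Python sketch of our own, with a hypothetical function name, is:

```python
import math


def se_of_std(sigma: float, n: int) -> float:
    """Formula (14): standard error of a sample standard deviation."""
    return sigma / math.sqrt(2 * n)


se_x = se_of_std(8.05, 329)        # Group I
se_y = se_of_std(6.95, 302)        # Group II
sigma_d = math.hypot(se_x, se_y)   # formula (20): root of the sum of squares
t = (8.05 - 6.95) / sigma_d

print(f"{se_x:.3f} {se_y:.3f} {sigma_d:.3f} t = {t:.1f}")
```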

As this problem well illustrates, in investigating the significance 
in differences it is a wise procedure to penetrate the problem as 
deeply as possible. 

Example. Two samples of weights of male students gave the following
information: $N_1 = 100$, $M_1 = 140.4$ lbs., $\sigma_1 = 17.7$ lbs.; $N_2 = 100$,
$M_2 = 136.8$ lbs., $\sigma_2 = 16.2$ lbs. If other samples are taken, what is the
probability that an observed difference in the means will be numerically
equal to or greater than $D = 140.4 - 136.8 = 3.6$ lbs.?

Solution.

$$\sigma_{M_1} = \frac{17.7}{\sqrt{100}} = 1.77, \qquad \sigma_{M_2} = \frac{16.2}{\sqrt{100}} = 1.62$$

$$\sigma_D = \sqrt{(1.77)^2 + (1.62)^2} = 2.4$$

$$t = \frac{D}{\sigma_D} = \frac{3.6}{2.4} = 1.5$$

$$P = 2\left[A\right]_{1.5}^{\infty} = 2(0.5000 - 0.4332) = 0.1336$$

That is, we would obtain a difference numerically as large as 3.6 about 
134 times in 1,000. 
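
This example lends itself to a single small routine. The Python sketch below is our own addition; the function name `prob_difference_at_least` is hypothetical, and because $\sigma_D$ is not rounded to 2.4 the probability differs from the table value in the fourth decimal place.

```python
import math


def prob_difference_at_least(d, sigma_1, n_1, sigma_2, n_2):
    """Chance that two independent sample means differ numerically by d or more,
    assuming both samples come from the same normal universe."""
    se_1 = sigma_1 / math.sqrt(n_1)
    se_2 = sigma_2 / math.sqrt(n_2)
    sigma_d = math.sqrt(se_1 ** 2 + se_2 ** 2)   # formula (19)
    t = abs(d) / sigma_d
    return math.erfc(t / math.sqrt(2))           # both tails of the normal curve


p = prob_difference_at_least(140.4 - 136.8, 17.7, 100, 16.2, 100)
print(f"P = {p:.3f}, or about {round(1000 * p)} times in 1,000")
```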


Figure 61 






EXERCISES 

1. The following table gives the distribution of the weights of 1,000 female
students subdivided into ten random samples, each of 100 individuals.
Measurements were recorded to nearest 1/10th pound.



Let the universe be the “total” group. 

a. Compute $M$ and $\sigma$ for each sample and for the total.

b. Compare the mean of the ten sample means with $M_u$.

c. Compare the mean of the ten sample $\sigma$'s with $\sigma_u$.

d. Using $\sigma_M = \dfrac{\sigma_u}{\sqrt{100}}$, how many of the ten sample means are within
the five per cent level of significance?

e. Using $\sigma_\sigma = \dfrac{\sigma_u}{\sqrt{200}}$, how many of the ten sample $\sigma$'s are within the
five per cent level of significance?

f. Do you believe that randomness went awry on any sample? 

2. (Tippett, p. 70) The lengths of 4,000 hairs of an Indian cotton gave
$M = 2.33$ cm. and $\sigma = 0.4806$ cm. “The first thousand hairs were selected
by a different method from the rest and gave a mean of 2.54 cm. Is this
deviation compatible with the hypothesis that the 1,000 are a random
sample from the 4,000 and that the difference in means is due to random
errors, or is the difference large enough to indicate that the change in
technique has had an effect?”

3. A contractor purchased a certain type of copper sheeting from a 
manufacturer. The contract specified that the sheets were to meet a 
theoretical standard — universe mean — of thickness 0.022 inch. The 
contractor measured a sample of 100 sheets and found $M = 0.020$ inch
and $\sigma = 0.003$ inch. Did the contractor have reason to complain?

4. For ten years we at Bucknell University have given to the in-coming
freshmen a standardized test in pre-college mathematics. Based upon
this experience with $S = 4{,}000$ we have established the norms for the test:
$M_u = 62$, $\sigma_u = 18$. The freshman class of 400 admitted in September
1939, Class of 1943, took the test with the results: $M = 58$ and $\sigma = 16$.
Would you agree that the Class of 1943 was significantly ill-prepared in
mathematics? The Class of 1945 with $N = 400$ took the test with the
results: $M = 60.5$ and $\sigma = 16.5$. Is the Class of 1945 within the five per
cent level?

5. During a given month one machine produced 900 units but spoiled
3.2 per cent of them. During the same month another machine with a 
more experienced operator produced 1,000 units but spoiled 2.8 per cent 
of them. Is the percentage difference in spoilage significant? 

6. A. S. Parkes and J. C. Drummond (Proc. Roy. Soc., B, XCVIII,
p. 147) gave the following data showing the effect of vitamin B on the 
sex-ratio of offspring in rats. May the percentage change in males be 
attributed to chance, or is the evidence sufficient to warrant that the 
change was due to the increased vitamin B? 


Diet                    Males    Females    Total Young    Per cent Males
Vitamin B Deficient     123      153        276            44.57
Vitamin B Sufficient    145      150        295            49.15
Totals                  268      303        571



114. SMALL SAMPLES 

The formulas for estimating the reliability of a statistic that we
have given previously are suitable when $N$ is reasonably large, say
30 or more, but require modification when $N$ is small. When $N$ is
small, the $\sigma$ of a sample which is used as an estimate of $\sigma_u$ gives
values too small and thus our standard errors have a downward bias.
To overcome this bias we need to develop a theory that will give
us a better estimate of the standard deviation of the universe $\sigma_u$
than is given by $\sigma$. We shall now attack the problem of finding the
standard deviation which gives the better estimate for $\sigma_u$.

Consider the parent universe $X_1, X_2, \ldots, X_S$. Transform the
$S$ variates to the mean of the universe $M_u$ as origin and denote them
by $x_1, x_2, \ldots, x_S$ where $x_i = X_i - M_u$. We then have for the
universe $\sum_{i=1}^{S} x_i = 0$.

From this universe we choose samples of $N$. In all we may choose
${}_SC_N$ samples. Each sample has its second moment and thus in all
we have ${}_SC_N$ second moments. These ${}_SC_N$ second moments give us
a distribution of sample second moments. It is our immediate
problem to find the mean of these ${}_SC_N$ sample second moments.

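Let

$m_{2,k}$ = the second moment of the $k$th sample about the mean
of the sample

Then, for the $k$th sample, we have

$$m_{2,k} = \frac{\Sigma x^2}{N} - \left(\frac{\Sigma x}{N}\right)^2 = \frac{\Sigma x^2}{N} - \frac{(\Sigma x)^2}{N^2}$$

Since $(\Sigma x)^2 = \Sigma x^2 + 2\Sigma\Sigma x_ix_j$, we have

$$m_{2,k} = \frac{1}{N^2}\left[(N - 1)\Sigma x^2 - 2\Sigma\Sigma x_ix_j\right]$$

where $i \neq j$, and the $\Sigma$'s cover only the sample.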

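The mean of the distribution of second moments is given by

$$M_{\sigma^2} = \frac{1}{{}_SC_N}\sum_{k=1}^{{}_SC_N} m_{2,k} = \frac{N - 1}{NS}\left[\Sigma x^2 - \frac{2}{S - 1}\,\Sigma\Sigma x_ix_j\right]$$

where the $\Sigma$'s cover the entire universe.

Again returning to

$$(\Sigma x)^2 = \Sigma x^2 + 2\Sigma\Sigma x_ix_j,$$

we note that for the universe $\Sigma x = 0$, and hence

$$-2\Sigma\Sigma x_ix_j = \Sigma x^2 = S\sigma_u^2$$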





Then, substituting and simplifying,

$$M_{\sigma^2} = \frac{S(N - 1)}{N(S - 1)}\,\sigma_u^2 \qquad (23)$$

If $S$ becomes infinite,

$$M_{\sigma^2} = \frac{N - 1}{N}\,\sigma_u^2 \qquad (24)$$

That is, if the parent population is very large, the expected $\mu_2$
or $\sigma^2$ is $(N - 1)/N$ times the parent $\mu_{2,u}$ or $\sigma_u^2$.
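
Formula (23) may be verified on a small universe by forming every possible sample and averaging the sample second moments. The Python sketch below is our own illustration; the six-item universe is hypothetical and the function name is our own.

```python
from itertools import combinations


def second_moment(values):
    """Second moment of the values about their own mean (divisor N)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


universe = [3, 5, 8, 13, 21, 34]  # hypothetical universe, S = 6
s = len(universe)
n = 3                             # size of each sample

# Second moment of each of the S-choose-N samples, then their mean.
moments = [second_moment(c) for c in combinations(universe, n)]
mean_of_moments = sum(moments) / len(moments)

# Right-hand side of formula (23).
predicted = s * (n - 1) / (n * (s - 1)) * second_moment(universe)

print(mean_of_moments, predicted)
```

The two printed values agree, and both fall short of the universe variance, which is the downward bias that the correction of the next paragraphs removes.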

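If we replace in (24) $M_{\sigma^2}$ by $\sigma^2_{\text{sample}}$ or $\sigma^2$, and $\sigma_u$ by $\sigma_{u,\text{ estimated}}$ or
$\sigma_{u,\text{est.}}$, where $\sigma_{u,\text{est.}}$ is the best estimate of the standard deviation of
the universe from the sample (what R. A. Fisher calls the maximum
likelihood estimation of $\sigma_u$ from a sample), we have

$$\sigma_{u,\text{est.}} = \sigma\sqrt{\frac{N}{N - 1}} \qquad (25)$$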


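If, as is customary, we find $\sigma$ for a sample of $N$ items by the
formula

$$\sigma = \sqrt{\frac{\Sigma x^2}{N}} \qquad (26)$$

we obtain

$$\sigma_{u,\text{est.}} = \sqrt{\frac{\Sigma x^2}{N - 1}} \qquad (27)$$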

Consequently, if we must estimate $\sigma_u$ from a sample, formula (27)
gives a better estimate than the customary one (26). Of course
if $N$ is large, it is a matter of little consequence whether we divide
by $N$ or by $(N - 1)$, but when $N$ is small, say less than 30, the use
of $(N - 1)$ is particularly important.
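
The two divisors correspond to two functions in Python's standard `statistics` module: `pstdev` divides by $N$ as in (26), and `stdev` divides by $N - 1$ as in (27). The sketch below is our own illustration, and the five measurements are made-up data chosen only to show the size of the correction for a small sample.

```python
import math
import statistics

sample = [61.0, 64.5, 66.0, 67.5, 71.0]  # hypothetical small sample, N = 5
n = len(sample)

sigma = statistics.pstdev(sample)     # formula (26): divisor N
sigma_est = statistics.stdev(sample)  # formula (27): divisor N - 1

# Formula (25): the two are connected by the factor sqrt(N / (N - 1)).
ratio = sigma_est / sigma

# Standard error of the mean for a small sample, written two equivalent ways.
se_from_26 = sigma / math.sqrt(n - 1)
se_from_27 = sigma_est / math.sqrt(n)

print(f"sigma = {sigma:.3f}, sigma_est = {sigma_est:.3f}, ratio = {ratio:.4f}")
print(f"standard error of the mean = {se_from_26:.3f}")
```

For $N = 5$ the factor is $\sqrt{5/4}$, or about 1.118, an increase of nearly 12 per cent; for $N = 100$ it would be only about one-half of one per cent.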

We immediately find, for $N$ small,¹

$$\sigma_M = \frac{\sigma}{\sqrt{N - 1}} \qquad (28)$$

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2(N - 1)}} \qquad (29)$$

where $\sigma$ is computed from (26). Corresponding formulas for $E_M$
and $E_\sigma$ are immediately found.

1 The introduction of the factor $\sqrt{N/(N - 1)}$ in (25) is called “Bessel's
correction,” and the formula for the standard error of the mean,
$\sigma_M = \sqrt{\dfrac{\Sigma x^2}{N(N - 1)}}$, is called “Bessel's formula”
[Friedrich Wilhelm Bessel (1784-1846)].

We may thus compute standard and probable errors for the various 
statistics, mean, standard deviation, differences, and so on when N 
is small by using formulas (27), (28), (29), and their substitutions 
in (21) and (22). A word of caution is in order with regard to apply- 
ing them to establish probability levels. 

In our previous discussion we have used the values of the normal
curve to assist in interpreting the values of

$$\frac{M - M_u}{\sigma_M}, \qquad \frac{\sigma - \sigma_u}{\sigma_\sigma}, \qquad \frac{M_1 - M_2}{\sigma_D}$$

because the distributions of these quantities are closely normal when
$N$ is large. When $N$ is small, these distributions deviate from
normality, the amount of the deviation increasing as $N$ decreases.
A special table has been devised by R. A. Fisher which gives values
of $t$ for various "degrees of freedom" $n$ ($n = N - 1$ in the above
formulas) and various probabilities $P$ that an observed value may
differ from zero by more than $\pm t$. Or it gives values of $t$ for
given levels of significance and given values of $n$.
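
Entries of such a table can be checked numerically. The sketch below
assumes the usual density of Student's t distribution, which is not
derived in this text, and integrates it by Simpson's rule; the function
names `t_density` and `two_tailed_p` are ours. It returns the
probability $P$ that a value differs from zero by more than $\pm t$
for $n$ degrees of freedom.

```python
from math import gamma, pi, sqrt


def t_density(t, n):
    """Ordinate of Student's t curve with n degrees of freedom
    (standard formula, assumed here rather than taken from the text)."""
    c = gamma((n + 1) / 2) / (sqrt(n * pi) * gamma(n / 2))
    return c * (1 + t * t / n) ** (-(n + 1) / 2)


def two_tailed_p(t, n, steps=2000):
    """Probability of a deviation larger than |t| in either direction.

    The area from 0 to |t| is found by Simpson's rule; `steps` must
    be an even number of subintervals.
    """
    t = abs(t)
    h = t / steps
    total = t_density(0.0, n) + t_density(t, n)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, n)
    area = total * h / 3
    return 1 - 2 * area


# For n = 10 the tabled values 2.228 and 3.169 should give P close to
# .05 and .01 respectively.
for t in (2.228, 3.169):
    print(f"n = 10, t = {t}:  P = {two_tailed_p(t, 10):.4f}")
```

The same function may be used in the other direction, by trial, to
find the value of $t$ belonging to a chosen level of significance.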

This table differs considerably from that of the normal curve.
For example, with $N = 11$ or $n = 10$, the 1 per cent level of
significance is at $t = \pm 2.58$ in the normal curve, whereas in the
Fisher table it is at $t = \pm 3.17$. When $N$ is larger than 20, the
differences are not so appreciable, and when $N$ is greater than 30 the
normal table may be used with slight error. This Fisher table is found
in the texts by Fisher and by Croxton and Cowden listed in the
Appendix. A general idea of the table may be obtained from the
portion that we reproduce in Table 100.

In the use of this table remember that a "level of significance"
refers to both tails of the distribution. Note too that it is set up
differently from the table of areas for the normal curve. A tail
of the normal curve is found by subtracting the tabulated value from
0.5000, and doubling this value yields the level of significance.



Table 100. Values of t for Degrees of Freedom n and Levels
of Significance P

                       Level of Significance P
   n       .9       .7       .5       .3       .1      .05      .01

   4     .134     .414     .741    1.190    2.132    2.776    4.604
   5     .132     .408     .727    1.156    2.015    2.571    4.032
   6     .131     .404     .718    1.134    1.943    2.447    3.707
   8     .130     .399     .706    1.108    1.860    2.306    3.355
  10     .129     .397     .700    1.093    1.812    2.228    3.169
  15     .128     .393     .691    1.074    1.753    2.131    2.947
  20     .127     .391     .687    1.064    1.725    2.086    2.845
  30     .127     .389     .683    1.055    1.697    2.042    2.750
   ∞    .1257    .3853    .6745   1.0364   1.6449   1.9600   2.5758


Table 100, however, shows $n$ (degrees of freedom) in the stub, $P$
(the level of significance) in the caption, and $t$ in the body of the
table. The last line of the table for $n = \infty$ shows values of $t$
obtained from the normal curve.
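
The values in the last line of the table can be recovered from the
normal curve directly. The following sketch uses `NormalDist` from the
Python standard library; the helper name `normal_t` is ours. Since the
level of significance $P$ covers both tails, each tail holds $P/2$.

```python
from statistics import NormalDist

LEVELS = (0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.01)


def normal_t(p):
    """Value of t exceeded in absolute value with probability p
    under the normal curve (both tails together hold p)."""
    return NormalDist().inv_cdf(1 - p / 2)


for p in LEVELS:
    print(f"P = {p:<5}  t = {normal_t(p):.4f}")
```

The value for $P = .5$ is the familiar 0.6745 that converts a standard
error into a probable error.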


Exercise: Show that $\sigma_{u,\text{est.}}$ may be found from

$$\sigma_{u,\text{est.}} = \sqrt{\frac{N\Sigma X^2 - (\Sigma X)^2}{N(N - 1)}} \tag{30}$$


Illustrative Example 1. A corporation has set as a standard the
mean breaking strength of a certain type of wire at 582 pounds.
A sample of 10 specimens was tested with the results shown in
Table 101. For the purposes for which the wire is used, values
within the 5 per cent level are tolerated. Does this sample meet the
requirements?


Table 101. Breaking Strength of Wire

              Breaking Strength
  Specimen        (pounds)           x        x^2
                     X

      1             581              2          4
      2             576            - 3          9
      3             584              5         25
      4             586              7         49
      5             575            - 4         16
      6             573            - 6         36
      7             574            - 5         25
      8             572            - 7         49
      9             588              9         81
     10             581              2          4

    Total          5790              0        298


Solution:
We have

$$M = \frac{5790}{10} = 579 \text{ pounds.}$$

$$\sigma = \sqrt{\frac{\Sigma x^2}{N}} = \sqrt{\frac{298}{10}} = 5.46 \text{ pounds.}$$

$$\sigma_M = \frac{\sigma}{\sqrt{N - 1}} = \frac{5.46}{\sqrt{9}} = 1.82 \text{ pounds.}$$

In the $t$ table for $n = N - 1 = 9$, we have at the 5 per cent level,
$t = \pm 2.3$. That is, a variation of $\pm 2.3(1.82)$ or $\pm 4.19$
pounds on either side of 582 pounds is tolerated. Hence the toleration
limits are $(582 \pm 4.19)$ pounds or from 577.81 pounds to 586.19
pounds. Certainly 579 pounds, the mean of the sample, is well within
these limits.
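
The arithmetic of this example may be retraced as follows. This is a
sketch only: the variable names are ours, the data are those of
Table 101, and the value $t = 2.3$ is the one read from the table in
the solution rather than computed.

```python
from math import sqrt

strengths = [581, 576, 584, 586, 575, 573, 574, 572, 588, 581]  # Table 101
standard = 582       # pounds, the standard set by the corporation
t_5_per_cent = 2.3   # 5 per cent value of t for n = 9, as used in the text

N = len(strengths)
M = sum(strengths) / N
sum_sq = sum((X - M) ** 2 for X in strengths)   # sum of squared deviations

sigma = sqrt(sum_sq / N)           # customary standard deviation, divisor N
sigma_M = sigma / sqrt(N - 1)      # standard error of the mean, small N

half_width = t_5_per_cent * sigma_M
lower, upper = standard - half_width, standard + half_width
within = lower <= M <= upper

print(f"M = {M:.0f}, sigma = {sigma:.2f}, sigma_M = {sigma_M:.2f}")
print(f"toleration limits: {lower:.2f} to {upper:.2f} pounds")
print("sample mean within limits:", within)
```

Carrying unrounded figures through the computation changes the limits
only in the second decimal place and does not affect the conclusion.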

Illustrative Example 2. Table 102 gives data on strength tests 
(lbs. per sq. in.) on two types of wool fabric. Is the difference in 
the means sufficient to warrant the conclusion that Type 2 is superior 
to Type 1? 


Table 102

         Type 1                      Type 2
  Specimen    Strength        Specimen    Strength

      1         139               1         137
      2         127               2         132
      3         134               3         135
      4         125               4         144
      5         141               5         131
      6         144               6         133
      7         128               7         136
      8         138               8         134
      9         131               9         139
     10         133              10         129


For these data we find

  $N_1 = 10$                              $N_2 = 10$
  $M_1 = 134$ lbs. per sq. in.            $M_2 = 135$ lbs. per sq. in.
  $\sigma_1 = 6.05$ lbs. per sq. in.      $\sigma_2 = 4.09$ lbs. per sq. in.

$$\sigma_{M_1} = \frac{\sigma_1}{\sqrt{N_1 - 1}} = 2.02 \text{ lbs. per sq. in.}$$

$$\sigma_{M_2} = \frac{\sigma_2}{\sqrt{N_2 - 1}} = 1.36 \text{ lbs. per sq. in.}$$

$$\sigma_D = \sqrt{(2.02)^2 + (1.36)^2} = 2.4 \text{ lbs. per sq. in.}$$

$$t = \frac{D}{\sigma_D} = \frac{135 - 134}{2.4} = .417$$

From Table 100, for $n = N - 1 = 9$ and $t = .417$ we estimate $P$
at about 0.7, indicating that a difference of $\pm 1$ lb. per sq. in.
might occur 7 times in 10. There is thus no evidence to support a
contention that Type 2 is superior to Type 1.
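
The figures of this example can be reproduced from the data of
Table 102. The sketch below follows the steps of the solution; the
helper name `mean_and_errors` is ours and is introduced only for the
illustration.

```python
from math import sqrt

type_1 = [139, 127, 134, 125, 141, 144, 128, 138, 131, 133]  # Table 102
type_2 = [137, 132, 135, 144, 131, 133, 136, 134, 139, 129]


def mean_and_errors(values):
    """Return the mean, the divisor-N standard deviation, and the
    small-sample standard error of the mean, sigma / sqrt(N - 1)."""
    n = len(values)
    m = sum(values) / n
    sigma = sqrt(sum((v - m) ** 2 for v in values) / n)
    return m, sigma, sigma / sqrt(n - 1)


M1, sigma_1, sigma_M1 = mean_and_errors(type_1)
M2, sigma_2, sigma_M2 = mean_and_errors(type_2)

# Standard error of the difference of the two means.
sigma_D = sqrt(sigma_M1 ** 2 + sigma_M2 ** 2)
t = (M2 - M1) / sigma_D

print(f"M1 = {M1:.0f}, sigma_1 = {sigma_1:.2f}, sigma_M1 = {sigma_M1:.2f}")
print(f"M2 = {M2:.0f}, sigma_2 = {sigma_2:.2f}, sigma_M2 = {sigma_M2:.2f}")
print(f"sigma_D = {sigma_D:.2f}, t = {t:.3f}")
```

With unrounded intermediate values $t$ comes out near 0.41, slightly
below the .417 obtained above from the rounded $\sigma_D = 2.4$; the
interpretation is unchanged.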

115. CONCLUDING REMARKS ON SAMPLING 

The statistical theory of sampling is a fundamental and basic 
problem in mathematical statistics. It has challenged and continues 
to challenge some of our best minds. The reader who may wish to 
pursue the problem further will find the following articles interesting 
and not too difficult. 

H. C. Carver, Fundamentals of the Theory of Sampling, Annals of
Math. Statistics, Vol. I, page 101.

C. C. Craig, An Application of Thiele's Semi-invariants to the Sam-
pling Problem, Metron, Vol. VII, No. 4.

W. E. Deming and R. T. Birge, Statistical Theory of Errors, The
Graduate School of U.S. Dept. of Agriculture, Wash., D.C.

Dunham Jackson, The Theory of Small Samples, Amer. Math.
Monthly, June-July, 1935.

C. H. Richardson, The Statistics of Sampling, Edwards Brothers,
Ann Arbor, Michigan.

H. L. Rietz, Topics in Sampling Theory, Bulletin of the American
Mathematical Society, April, 1937.

W. A. Shewhart, Economic Control of Quality of Manufactured Prod-
uct, D. Van Nostrand Co., New York City.

116. SUMMARY OF RELIABILITY FORMULAS 

In this chapter we have undertaken only to introduce the reader to
what Karl Pearson has called the fundamental problem in statistics,
namely, the problem of sampling. To do more in an elementary
text would not be good judgment on our part. A list of the probable
errors that are needed most frequently follows and includes a few
which we are not in a position to derive here. 1


Statistical Constant                      Probable Error

The arithmetic mean                       $0.6745\,\dfrac{\sigma}{\sqrt{N}}$

The median (normal distribution)          $0.8454\,\dfrac{\sigma}{\sqrt{N}}$

The standard deviation (normal            $0.6745\,\dfrac{\sigma}{\sqrt{2N}} = 0.4769\,\dfrac{\sigma}{\sqrt{N}}$
distribution)

The coefficient of correlation (nor-      $0.6745\,\dfrac{1 - r^2}{\sqrt{N}}$
mal distribution)

$\alpha_3$ for a normal distribution      $0.6745\,\sqrt{\dfrac{6}{N}}$

$\alpha_4$ for a normal distribution      $0.6745\,\sqrt{\dfrac{24}{N}}$
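
The first four probable errors of the list are easily put into
computational form. The following is a minimal sketch; the function
names are ours, and the values $\sigma = 10$, $N = 100$, $r = 0.5$ in
the usage lines are hypothetical figures chosen only to exercise the
formulas.

```python
from math import sqrt

PE_RATIO = 0.6745   # probable error as a multiple of the standard error


def pe_mean(sigma, n):
    """Probable error of the arithmetic mean."""
    return PE_RATIO * sigma / sqrt(n)


def pe_median(sigma, n):
    """Probable error of the median (normal distribution)."""
    return 0.8454 * sigma / sqrt(n)


def pe_std_dev(sigma, n):
    """Probable error of the standard deviation (normal distribution)."""
    return PE_RATIO * sigma / sqrt(2 * n)


def pe_correlation(r, n):
    """Probable error of the coefficient of correlation
    (normal distribution)."""
    return PE_RATIO * (1 - r * r) / sqrt(n)


# Hypothetical sample: sigma = 10, N = 100, r = 0.5.
print(f"E_M      = {pe_mean(10, 100):.4f}")
print(f"E_median = {pe_median(10, 100):.4f}")
print(f"E_sigma  = {pe_std_dev(10, 100):.4f}")
print(f"E_r      = {pe_correlation(0.5, 100):.4f}")
```

Each function assumes, as the list does, that $N$ is large enough for
the normal curve to be used in interpreting the result.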



EXERCISES 

1. For the distribution of scores in English, (a) of Exercise 4, page 102,
we have found N = 334, M = 149.8, σ = 42.47. Find $E_M$ and interpret it.
Also find $\sigma_\sigma$ and interpret it.

2. For the distribution of the lengths of eggs, (a) of Exercise 15, page
105, we have found N = 450, M = 56.323 mm., σ = 2.386 mm. What is
the probability that the sample mean does not differ from the universe
mean by more than ± 0.09 mm.? What is the probability that the
sample dispersion does not differ from the true dispersion of the universe
by more than ± 0.07 mm.?

3. Find $\sigma_M$ and $\sigma_\sigma$ for the distribution of pulse beats,
Table 29, page 165. Find the probability that the sample mean does not
differ from the universe mean by more than ± 1.0 pulse beats per minute.

4. Assuming normality, find $\sigma_r$ and $E_r$ for the data of Table 59,
and interpret them.

5. Find $E_M$ for the data of the chest measurements of men, Exercise 10,
page 168, and interpret it.

1 For the probable errors of other constants, see Rietz and others, op. cit., p. 77.





6. The following are summaries of the results on placement tests in 
English which were given to two freshman classes entering Bucknell Uni- 
versity. 

       Group I              Group II

       N = 334              N = 302
       M = 149.79           M = 158.37
       σ = 42.47            σ = 36.28


Is the difference between the means significant? 

7. The heights of two groups of soldiers were measured and the follow- 
ing results were secured: 


       Group I                   Group II

       N = 10,000                N = 10,000
       M = 67.51 inches          M = 62.24 inches
       σ = 2.20 inches           σ = 2.25 inches


Is the difference in the means sufficient to warrant belief that the two 
groups were chosen from different races? 

8. We present below two frequency distributions based upon the batting
averages of players in the National and the American leagues during the
early part of the 1925 season. (See the accompanying table.)


Frequency Distribution of Batting Averages 1

                     Number of Players in      Number of Players in
  Batting Average    the National League       the American League
                     with the Given Average    with the Given Average

    .050-.099                  3                         0
    .100-.149                  7                        11
    .150-.199                 11                        11
    .200-.249                 21                        22
    .250-.299                 31                        35
    .300-.349                 34                        28
    .350-.399                 18                        13
    .400-.449                  4                         6
    .450-.499                  0                         0
    .500-.549                  3                         2
    .550-.599                  0                         1


Is the difference in the means of these distributions significant?

1 New York Herald Tribune, May 17, 1925. See also F. C. Mills and D. H.
Davenport, Manual of Problems and Tables of Statistics, 1925, p. 65.





9. In the theory of the chapter we have assumed that the parent popu-
lation consisted of the $S$ variates $X_1, X_2, \ldots, X_S$. We proved that
$M_z = M_x$, where $M_z$ is the mean of the distribution of sample means and
$M_x$ is the mean of the parent population. Let us now transform the $S$
variates to this mean as origin and denote them by $x_i = X_i - M_x$,
$(i = 1, 2, \ldots, S)$.

Let $z_i$ be the $i$th sample mean of $N$ variates chosen from the population
$x_1, x_2, \ldots, x_S$. We may have the ${}_S C_N$ distinct sample means
which are given by the following equations:

$$z_1 = \frac{1}{N}\left[x_1 + x_2 + \cdots + x_{N-1} + x_N\right]$$

$$z_2 = \frac{1}{N}\left[x_1 + x_2 + \cdots + x_{N-1} + x_{N+1}\right]$$

$$\cdots$$

$$z_{{}_S C_N} = \frac{1}{N}\left[x_{S-N+1} + x_{S-N+2} + \cdots + x_{S-1} + x_S\right]$$

Recalling that $\sum_{i=1}^{S} x_i = 0$:

a. Show that:

$$\Sigma z_i = 0$$

b. Show that:

$$\frac{\Sigma z^2}{{}_S C_N} = \frac{1}{N^2}\left[\frac{N}{S}\,\Sigma x_i^2 + \frac{2N(N - 1)}{S(S - 1)}\,\Sigma x_i x_j\right]$$

which, upon applying the proper symmetric products, reduces to:

$$\frac{\Sigma z^2}{{}_S C_N} = \frac{S - N}{N(S - 1)}\cdot\frac{\Sigma x^2}{S} = \frac{S - N}{N(S - 1)}\,\sigma_x^2$$

c. Use a. and b. and show that:

$$\sigma_z^2 = \frac{S - N}{N(S - 1)}\,\sigma_x^2$$

d. Show that:

$$\frac{\Sigma z^3}{{}_S C_N} = \frac{1}{N^3}\left[\frac{N}{S}\,\Sigma x_i^3 + \frac{3N(N - 1)}{S(S - 1)}\,\Sigma x_i^2 x_j + \frac{6N(N - 1)(N - 2)}{S(S - 1)(S - 2)}\,\Sigma x_i x_j x_k\right]$$

which, upon applying the proper symmetric products, reduces to

$$\frac{\Sigma z^3}{{}_S C_N} = \frac{(S - N)(S - 2N)}{N^2(S - 1)(S - 2)}\cdot\frac{\Sigma x^3}{S}$$

and finally to:

$$\mu_{3:z} = \frac{(S - N)(S - 2N)}{N^2(S - 1)(S - 2)}\,\mu_{3:x}$$

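
Before attempting the proofs asked for in parts c and d of Exercise 9,
the reader may find it reassuring to verify the two results on a small
population by listing every sample. The sketch below is not a proof
and is not part of the exercise: the five values in `population` are
hypothetical, and the names `moment_about_mean`, `part_c` and `part_d`
are ours. Exact fractions are used so that equality can be tested
without rounding.

```python
from fractions import Fraction
from itertools import combinations


def moment_about_mean(values, r):
    """r-th moment of `values` about their own mean, as an exact fraction."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** r for v in values) / n


population = [1, 2, 4, 7, 11]   # hypothetical parent population, S = 5
S, N = len(population), 3

# The mean of every one of the S-choose-N possible samples.
sample_means = [Fraction(sum(s), N) for s in combinations(population, N)]

var_z = moment_about_mean(sample_means, 2)
mu3_z = moment_about_mean(sample_means, 3)
var_x = moment_about_mean(population, 2)
mu3_x = moment_about_mean(population, 3)

# Right-hand sides of the results stated in parts c and d.
part_c = Fraction(S - N, N * (S - 1)) * var_x
part_d = Fraction((S - N) * (S - 2 * N),
                  N ** 2 * (S - 1) * (S - 2)) * mu3_x

print("part c agrees:", var_z == part_c)
print("part d agrees:", mu3_z == part_d)
```

Since the mean of the sample means equals the mean of the population,
the moments about the mean computed here are the same as the moments
about $M_x$ used in the exercise.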
10. Distributions of the heights of men born in England and in Scotland 
gave the following results: 

       England                    Scotland

       N = 6,194                  N = 1,304
       M = 67.4375 inches         M = 68.5456 inches
       σ = 2.548 inches           σ = 2.480 inches

Is the difference in the means sufficient to conclude that Scots are really 
taller than Englishmen? 

11. A distribution of 150 people in normal condition gave an average 
pulse rate of 79.68 ± 0.15 beats per minute but after being administered 
a certain drug they showed an average pulse rate of 81.12 ± 0.20 beats 
per minute. Is it probable that the increase in the pulse rate was due to 
the drug, or is the increase simply a result of variation due to sampling? 

12. For the distributions of wages received by clothing workers in
Cincinnati, Cleveland, and St. Louis we have found the values given in
the table. Are the differences of the means significant? [See page 75.]

             Cincinnati     Cleveland     St. Louis

      M        $16.77         $21.48        $15.90
      σ          6.86           6.28          6.04


13. The average grades of sorority and non-sorority women on a certain 
campus were as follows: 

       Sorority Group          Non-sorority Group

       N = 175                 N = 150
       M = 81.23               M = 79.62
       σ = 10.18               σ = 9.37

Is the difference of the arithmetic means sufficient to conclude that 
there was a real difference in the scholarship of the two groups? 

14. Desiring to test the milk-producing qualities of two different kinds 
of food, a dairy association separated, by a random selection, 800 cows 
into two different herds. All other conditions were kept identical as far 
as possible. Observing the cows for a certain period, the following results 
were obtained: 




       Herd Number 1                       Herd Number 2

       $N_1 = 400$                         $N_2 = 400$
       $M_1 = 36$ quarts per cow           $M_2 = 40$ quarts per cow
       $\sigma_1 = 5.4$ quarts per cow     $\sigma_2 = 4.5$ quarts per cow


Determine whether the difference between the average yields of the two 
herds is or is not significant. 

15. The following table exhibits two frequency distributions relating 
to the earnings of coal miners in two different sections of Illinois. Is the 
difference between their means sufficient to conclude that these two sec- 
tions do not belong to the same homogeneous group? 


Pick Miners in Illinois Coal Mines Classified Accord-
ing to Average Daily Earnings, 1918-1921 1

                            Number of Pay Checks
  Range of Average      In 21 Central     In 52 Southern
  Daily Earnings        Illinois Mines    Illinois Mines

  $ 2.00- 2.99                501                87
    3.00- 3.99              1,288               131
    4.00- 4.99              3,222               306
    5.00- 5.99              6,293               563
    6.00- 6.99              9,821               973
    7.00- 7.99          [illegible]       [illegible]
    8.00- 8.99          [illegible]             2,684
    9.00- 9.99              9,484             5,584
   10.00-10.99              6,748             2,426
   11.00-11.99              4,418             1,433
   12.00-12.99              2,551               853
   13.00-13.99              1,304               577
   14.00-14.99                696               364
   15.00-15.99                362               197
   16.00-16.99                196               105
   17.00-17.99                115                71
   18.00-18.99                 57                35
   19.00-19.99                 39                33
   20.00-20.99                 25                13
   21.00-21.99                 16                 6
   22.00-22.99                 13                 7
   23.00-23.99                 10                 4
   24.00-24.99                 10                 4

      Total                72,127            17,986


1 See Mills and Davenport, op. cit., p. 107. 










16. Two types of electric bulbs were observed as to the length of life, 
and the following data were secured: 

       Type 1                      Type 2

       $N_1 = 100$                 $N_2 = 100$
       $M_1 = 1224$ hours          $M_2 = 1036$ hours
       $\sigma_1 = 36$ hours       $\sigma_2 = 40$ hours


Is the difference in the two means sufficient to warrant the conclusion 
that Type 1 is a bulb superior to Type 2? 

17. A large number of men were measured as to height giving $M_u$
= 68.1 inches and $\sigma_u$ = 2.5 inches. How large a sample should be
taken in order to be fairly sure (probability 0.95) that the sample mean
may not differ from the true mean by more than ± 0.5 inch?

18. The weights of 400 male babies of the same nationality were analyzed.
The analysis yielded M = 7.29 pounds and σ = 1.01 pounds. What
statements can you make about the universe mean weight of babies of
this nationality? If the universe mean were known to be 7.5 pounds,
would you consider the above described sample a random one?

19. (Treloar, p. 143) "Data secured from the archives of the Sloane
Hospital, New York City, for length of new-born infants of Irish parents
yielded the following statistics

       Male (X)                  Female (F)

       N = 1,136                 N = 1,071
       M = 51.96 cm.             M = 51.22 cm.
       σ = 2.181 cm.             σ = 2.189 cm.


Do these results justify the inference that Irish male offspring are in 
general longer than females at birth? Do the results justify the inference 
that male babies are generally less variable in length than females at 
birth? 

20. The cost of building an identical house in various parts of the
United States in 1940 gave

M = $6,029    σ = $459    N = number of cities = 77.

The cost of building the same house during the first quarter of 1941 gave

M = $6,232    σ = $504    N = number of cities = 68.

Is this increase in average cost significant?

21. The British Cotton Industry Research Association tested the break-
ing load on two types of yarn with the following results:

       Type I                  Type II

       N = 1,782               N = 1,914
       M = 6.83 oz.            M = 7.48 oz.
       σ = 1.23 oz.            σ = 1.33 oz.


Is the difference in the mean breaking-load significant? 

22. Karl Pearson and Alice Lee ( Biometrika , Vol. II, p. 415) secured 
the measurements of the stature of 1078 fathers and sons. The analysis 
yielded the results: 


       Fathers                   Sons

       N = 1,078                 N = 1,078
       M = 67.70 inches          M = 68.66 inches
       σ = 2.72 inches           σ = 2.75 inches

                    r = 0.51


Determine if the difference in the means is significant. 

23. The following exercise is based upon data given in the "Proceedings
of the American Society for Testing Materials," 1930, Vol. 30, Part II,
pp. 448-455. A Committee of the Society, appointed to study corrosion,
made numerous studies of the length of life of steel plates immersed in
city water. The Committee found that the length of life was distributed
normally. Numerous tests on No. 16 gauge sheets immersed in Washington
tap water gave: $M_u$ = 1940 days and $\sigma_u$ = 224 days.

a. What is the probability that the mean of a sample of 100 sheets will
not differ more than 25 days from $M_u$?

b. Find the 5 per cent level of significance for the mean of a sample of 
100 sheets. 

c. What should be the size of sample in future tests in order that the 
probability will not be greater than of the sample mean being in error 
by more than 74 days? 

24. The following item appeared in the New York Times November 22,
1942. "TALL FRESHMEN: From Yale comes the news that the class
of 1945 is the youngest and tallest that ever entered the university.
Average freshman age is 18 years, 1 month and 11 days. Average height
5 feet 8.5 inches. Compared with his predecessor of World War I the Yale
freshman of today is ten pounds heavier and 1.7 inches taller. Of all this
year's Yale freshmen 21.6 per cent (227 in actual numbers) are over
six feet tall."

Assuming N = 1,000, $\sigma_{\text{weight}}$ = 17 pounds and
$\sigma_{\text{height}}$ = 2.5 inches, would you say the above item was
noteworthy?

25. The ages of 5,317 husbands and wives were secured and the analysis
of the data yielded the results:

       Husbands                  Wives

       N = 5,317                 N = 5,317
       M = 42.8 years            M = 40.6 years
       σ = 13.1 years            σ = 12.7 years

                    r = 0.91

Basing your judgment on these data would you state that the difference 
in the means is significant? 



APPENDIX A 


SELECTED BOOKS FOR SUPPLEMENTARY READING 
A. General Texts 

B. H. Camp, Mathematical Part of Elementary Statistics, D. C. Heath
and Company, 1931.

F. E. Croxton and D. J. Cowden, Applied General Statistics, Prentice-
Hall, 1941.

R. A. Fisher, Statistical Methods for Research Workers, 7th edition,
Oliver and Boyd, London, 1938.

John F. Kenney, Mathematics of Statistics, D. Van Nostrand Com-
pany, 1939.

H. L. Rietz, Mathematical Statistics, Open Court Publishing Com-
pany, 1927. An excellent monograph for students who have had
calculus and an elementary course in statistics.

L. H. C. Tippett, The Methods of Statistics, 3rd edition, Williams and
Norgate, London, 1941. This book is mainly one of interpreta-
tions with the illustrations biological.

Alan E. Treloar, Elements of Statistical Reasoning, John Wiley and
Sons, 1939. This book is concerned mainly with interpretations.

Albert E. Waugh, Elements of Statistical Method, McGraw-Hill Book
Company, 1938.

G. Udny Yule and M. G. Kendall, An Introduction to the Theory
of Statistics, 12th edition, Charles Griffin and Company, London,
1940.


B. Texts in Special Fields 

R. E. Chaddock, Principles and Methods of Statistics, Houghton
Mifflin Company, 1925. Mainly descriptive and philosophical.
It is intended for students of the social sciences.

Karl Holzinger, Statistical Methods for Students in Education, Ginn
and Company, 1928.

F. C. Mills, Statistical Methods Applied to Economics and Business,
Revised, Henry Holt and Company, 1938.

Raymond Pearl, Medical Biometry and Statistics, second edition, W. B.
Saunders and Company, 1930.

George W. Snedecor, Statistical Methods Applied to Experiments in
Agriculture and Biology, The Iowa State College Press, 1940.

Herbert Sorenson, Statistics for Students of Psychology and Education,
McGraw-Hill Book Company, 1936.

C. Texts on Related Mathematical Topics 

J. L. Coolidge, An Introduction to Mathematical Probability, Oxford
University Press, 1925. A careful analysis of the fundamentals
of the theory of probability.

Mordecai Ezekiel, Methods of Correlation Analysis, second edition,
John Wiley and Sons, 1941.

Joseph Lipka, Graphical and Mechanical Computation, Part II,
John Wiley and Sons, 1918. An excellent reference for curve-
fitting which may be used in connection with that of Running.

H. L. Rietz, and others, Handbook of Mathematical Statistics, Hough-
ton Mifflin Company, 1924. A useful reference book for one who
has a good background in mathematics and statistics. A collection
of chapters on important statistical topics.

T. R. Running, Empirical Formulas, John Wiley and Sons, 1917.
A valuable reference for curve-fitting.

J. B. Scarborough, Numerical Mathematical Analysis, Johns Hopkins
Press, 1930. Excellent chapters on interpolation, the normal
curve, least squares, and empirical formulas. A valuable book.

Hugh H. Wolfenden, The Fundamental Principles of Mathematical
Statistics, The Actuarial Society of America, New York, 1942.

D. Graphical Methods 

W. C. Brinton, Graphic Methods for Presenting Facts, Engineering
Magazine Company, New York, 1914.

S. C. Haskell, How to Make and Use Graphic Charts, Codex Book
Company, New York, 1923.

K. G. Karsten, Charts and Graphs, Prentice-Hall, New York, 1923.


E. Aids in Calculation

J. W. Glover, Tables of Applied Mathematics and Statistics, George
Wahr, Ann Arbor, Mich., 1923. This book contains many helpful
tables.

Karl Pearson, Tables for Statisticians and Biometricians, Part I,
second edition, Cambridge University Press. These tables are
indispensable to the advanced student.

Mathematical Tables from Handbook of Chemistry and Physics, Chemi- 
cal Rubber Company, Cleveland. 

F. Sources for Current Statistical Data 

Statistical Abstract of the United States, published annually by the 
Government Printing Office, Washington, D.C. 

Yearbook of Agriculture, published annually by the Government 
Printing Office, Washington, D.C. 

Survey of Current Business, United States Department of Commerce, 
Washington, D.C. Published monthly. 

World Almanac and Book of Facts, New York World, New York. 


APPENDIX B 


AREAS AND ORDINATES OF THE NORMAL CURVE 


$$\phi(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}$$


The following table gives the values of the area under the curve 
from the ordinate at t = 0 to the ordinate for the values of t given 
in the column at the left. Values of the ordinate are also given. 



   t    Area   φ(t)       t    Area   φ(t)       t    Area   φ(t)       t    Area   φ(t)

  .00  .0000  .3989      .40  .1554  .3683      .80  .2881  .2897     1.20  .3849  .1942
  .01  .0040  .3989      .41  .1591  .3668      .81  .2910  .2874     1.21  .3869  .1919
  .02  .0080  .3989      .42  .1628  .3653      .82  .2939  .2850     1.22  .3888  .1895
  .03  .0120  .3988      .43  .1664  .3637      .83  .2967  .2827     1.23  .3907  .1872
  .04  .0160  .3986      .44  .1700  .3621      .84  .2996  .2803     1.24  .3925  .1849
  .05  .0199  .3984      .45  .1736  .3605      .85  .3023  .2780     1.25  .3944  .1827
  .06  .0239  .3982      .46  .1772  .3589      .86  .3051  .2756     1.26  .3962  .1804
  .07  .0279  .3980      .47  .1808  .3572      .87  .3079  .2732     1.27  .3980  .1781
  .08  .0319  .3977      .48  .1844  .3555      .88  .3106  .2709     1.28  .3997  .1759
  .09  .0359  .3973      .49  .1879  .3538      .89  .3133  .2685     1.29  .4015  .1736
  .10  .0398  .3970      .50  .1915  .3521      .90  .3159  .2661     1.30  .4032  .1714
  .11  .0438  .3965      .51  .1950  .3503      .91  .3186  .2637     1.31  .4049  .1692
  .12  .0478  .3961      .52  .1985  .3485      .92  .3212  .2613     1.32  .4066  .1669
  .13  .0517  .3956      .53  .2019  .3467      .93  .3238  .2589     1.33  .4082  .1647
  .14  .0557  .3951      .54  .2054  .3448      .94  .3264  .2565     1.34  .4099  .1626
  .15  .0596  .3945      .55  .2088  .3429      .95  .3289  .2541     1.35  .4115  .1604
  .16  .0636  .3939      .56  .2123  .3411      .96  .3315  .2516     1.36  .4131  .1582
  .17  .0675  .3932      .57  .2157  .3391      .97  .3340  .2492     1.37  .4147  .1561
  .18  .0714  .3925      .58  .2190  .3372      .98  .3365  .2468     1.38  .4162  .1540
  .19  .0754  .3918      .59  .2224  .3352      .99  .3389  .2444     1.39  .4177  .1518
  .20  .0793  .3910      .60  .2258  .3332     1.00  .3413  .2420     1.40  .4192  .1497
  .21  .0832  .3902      .61  .2291  .3312     1.01  .3438  .2396     1.41  .4207  .1476
  .22  .0871  .3894      .62  .2324  .3292     1.02  .3461  .2371     1.42  .4222  .1456
  .23  .0910  .3885      .63  .2357  .3271     1.03  .3485  .2347     1.43  .4236  .1435
  .24  .0948  .3876      .64  .2389  .3251     1.04  .3508  .2323     1.44  .4251  .1415
  .25  .0987  .3867      .65  .2422  .3230     1.05  .3531  .2299     1.45  .4265  .1394
  .26  .1026  .3857      .66  .2454  .3209     1.06  .3554  .2275     1.46  .4279  .1374
  .27  .1064  .3847      .67  .2486  .3187     1.07  .3577  .2251     1.47  .4292  .1354
  .28  .1103  .3836      .68  .2518  .3166     1.08  .3599  .2227     1.48  .4306  .1334
  .29  .1141  .3825      .69  .2549  .3144     1.09  .3621  .2203     1.49  .4319  .1315
  .30  .1179  .3814      .70  .2580  .3123     1.10  .3643  .2179     1.50  .4332  .1295
  .31  .1217  .3802      .71  .2612  .3101     1.11  .3665  .2155     1.51  .4345  .1276
  .32  .1255  .3790      .72  .2642  .3079     1.12  .3686  .2131     1.52  .4357  .1257
  .33  .1293  .3778      .73  .2673  .3056     1.13  .3708  .2107     1.53  .4370  .1238
  .34  .1331  .3765      .74  .2704  .3034     1.14  .3729  .2083     1.54  .4382  .1219
  .35  .1368  .3752      .75  .2734  .3011     1.15  .3749  .2059     1.55  .4394  .1200
  .36  .1406  .3739      .76  .2764  .2989     1.16  .3770  .2036     1.56  .4406  .1182
  .37  .1443  .3726      .77  .2794  .2966     1.17  .3790  .2012     1.57  .4418  .1163
  .38  .1480  .3712      .78  .2823  .2943     1.18  .3810  .1989     1.58  .4430  .1145
  .39  .1517  .3697      .79  .2852  .2920     1.19  .3830  .1965     1.59  .4441  .1127


   t    Area   φ(t)       t    Area   φ(t)       t    Area   φ(t)       t    Area   φ(t)

 1.60  .4452  .1109     2.00  .4773  .0540     2.40  .4918  .0224     2.80  .4974  .0079
 1.61  .4463  .1092     2.01  .4778  .0529     2.41  .4920  .0219     2.81  .4975  .0077
 1.62  .4474  .1074     2.02  .4783  .0519     2.42  .4922  .0213     2.82  .4976  .0075
 1.63  .4485  .1057     2.03  .4788  .0508     2.43  .4925  .0208     2.83  .4977  .0073
 1.64  .4495  .1040     2.04  .4793  .0498     2.44  .4927  .0203     2.84  .4977  .0071
 1.65  .4505  .1023     2.05  .4798  .0488     2.45  .4929  .0198     2.85  .4978  .0069
 1.66  .4515  .1006     2.06  .4803  .0478     2.46  .4931  .0194     2.86  .4979  .0067
 1.67  .4525  .0989     2.07  .4808  .0468     2.47  .4932  .0189     2.87  .4980  .0065
 1.68  .4535  .0973     2.08  .4812  .0459     2.48  .4934  .0184     2.88  .4980  .0063
 1.69  .4545  .0957     2.09  .4817  .0449     2.49  .4936  .0180     2.89  .4981  .0061
 1.70  .4554  .0941     2.10  .4821  .0440     2.50  .4938  .0175     2.90  .4981  .0060
 1.71  .4564  .0925     2.11  .4826  .0431     2.51  .4940  .0171     2.91  .4982  .0058
 1.72  .4573  .0909     2.12  .4830  .0422     2.52  .4941  .0167     2.92  .4983  .0056
 1.73  .4582  .0893     2.13  .4834  .0413     2.53  .4943  .0163     2.93  .4983  .0055
 1.74  .4591  .0878     2.14  .4838  .0404     2.54  .4945  .0159     2.94  .4984  .0053
 1.75  .4599  .0863     2.15  .4842  .0396     2.55  .4946  .0155     2.95  .4984  .0051
 1.76  .4608  .0848     2.16  .4846  .0387     2.56  .4948  .0151     2.96  .4985  .0050
 1.77  .4616  .0833     2.17  .4850  .0379     2.57  .4949  .0147     2.97  .4985  .0049
 1.78  .4625  .0818     2.18  .4854  .0371     2.58  .4951  .0143     2.98  .4986  .0047
 1.79  .4633  .0804     2.19  .4857  .0363     2.59  .4952  .0139     2.99  .4986  .0046
 1.80  .4641  .0790     2.20  .4861  .0355     2.60  .4953  .0136     3.00  .4987  .0044
 1.81  .4649  .0775     2.21  .4865  .0347     2.61  .4955  .0132     3.01  .4987  .0043
 1.82  .4656  .0761     2.22  .4868  .0339     2.62  .4956  .0129     3.02  .4987  .0042
 1.83  .4664  .0748     2.23  .4871  .0332     2.63  .4957  .0126     3.03  .4988  .0041
 1.84  .4671  .0734     2.24  .4875  .0325     2.64  .4959  .0122     3.04  .4988  .0039
 1.85  .4678  .0721     2.25  .4878  .0317     2.65  .4960  .0119     3.05  .4989  .0038
 1.86  .4686  .0707     2.26  .4881  .0310     2.66  .4961  .0116     3.06  .4989  .0037
 1.87  .4693  .0694     2.27  .4884  .0303     2.67  .4962  .0113     3.07  .4989  .0036
 1.88  .4700  .0681     2.28  .4887  .0297     2.68  .4963  .0110     3.08  .4990  .0035
 1.89  .4706  .0669     2.29  .4890  .0290     2.69  .4964  .0107     3.09  .4990  .0034
 1.90  .4713  .0656     2.30  .4893  .0283     2.70  .4965  .0104     3.10  .4990  .0033
 1.91  .4719  .0644     2.31  .4896  .0277     2.71  .4966  .0101     3.11  .4991  .0032
 1.92  .4726  .0632     2.32  .4898  .0271     2.72  .4967  .0099     3.12  .4991  .0031
 1.93  .4732  .0620     2.33  .4901  .0264     2.73  .4968  .0096     3.13  .4991  .0030
 1.94  .4738  .0608     2.34  .4904  .0258     2.74  .4969  .0094     3.14  .4992  .0029
 1.95  .4744  .0596     2.35  .4906  .0252     2.75  .4970  .0091     3.15  .4992  .0028
 1.96  .4750  .0584     2.36  .4909  .0246     2.76  .4971  .0089     3.16  .4992  .0027
 1.97  .4756  .0573     2.37  .4911  .0241     2.77  .4972  .0086     3.17  .4992  .0026
 1.98  .4762  .0562     2.38  .4913  .0235     2.78  .4973  .0084     3.18  .4993  .0025
 1.99  .4767  .0551     2.39  .4916  .0229     2.79  .4974  .0081     3.19  .4993  .0025


470 AREAS AND ORDINATES OF NORMAL CURVE 





n 



n 

EP 

||i9 

t 

• e - 

i i 

o •*. 

<KD 

3.20 





.0009 

3.80 

.4999 

HI 

4.10 

.5000 

.0001 

3.21 

.4993 

.0023 

3.51 

.4998 

.0008 

3.81 

.4999 

.0003 

4.11 

.5000 

.0001 

3.22 

.4994 

.0022 

3.52 

.4998 

.0008 

3.82 

.4999 

.0003 

4.12 

.5000 

.0001 

3.23 

.4994 

.0022 

3.53 

.4998 

.0008 

3.83 

.4999 

.0003 

4.13 

.5000 

.0001 

3.24 

.4994 

.0021 

3.54 

.4998 

.0008 

3.84 

.4999 

.0003 

4.14 

.5000 

.0001 

3.25 

.4994 

.0020 

3.55 

.4998 

.0007 

3.85 

.4999 

.0002 

4.15 

.5000 

.0001 

3.26 

,4994 

.0020 

3.56 

.4998 

.0007 

3.86 

.4999 

.0002 

4.16 

.5000 

.0001 

3.27 

.4995 

.0019 

3.57 

.4998 

.0007 

3.87 

.5000 

.0002 

4.17 

.5000 

.0001 

3.28 

.4995 

,0018 

3.58 

.4998 

.0007 

3.88 

.5000 

.0002 

4.18 

.5000 

.0001 

3.29 

.4995 

.0018 

3.59 

.4998 

.0006 

3.89 

.5000 

.0002 

4.19 

.5000 

.0001 

3.30 

.4995 

.0017 

3.60 

.4998 

.0006 

3.90 

.5000 

.0002 

4.20 

.5000 

.0001 

3.31 

.4995 

.0017 

3.61 

.4999 

.0006 

3.91 

.5000 

.0002 

4.21 

.5000 

.0001 

3.32 

.4996 

.0016 

3.62 

.4999 

.0006 

3.92 

.5000 

.0002 

4.22 

.5000 

.0001 

3.33 

.4996 

.0016 

3.63 

.4999 

.0006 

3.93 

.5000 

.0002 

4.23 

.5000 

.0001 

3.34 

.4996 

.0015 

3.64 

.4999 

.0005 

3.94 

.5000 

.0002 

4.24 

.5000 

.0001 

3.35 

.4996 

.0015 

3.65 

.4999 

.0005 

3.95 

.5000 

.0002 

4.25 

.5000 

.0001 

3.36 

.4996 

.0014 

3.66 

.4999 

.0005 

3.96 

.5000 

.0002 

4.26 

.5000 

.0001 

3.37 

.4996 

.0014 

3.67 

.4999 

.0005 

3.97 

.5000 

.0002 

4.27 

.5000 

.0000 

3.38 

.4996 

.0013 

3.68 

.4999 

.0005 

3.98 

.5000 

.0001 

4.28 

.5000 

.0000 

3.39 

.4997 

.0013 

3.69 

.4999 

.0004 

3.99 

.5000 

.0001 

4.29 

.5000 

.0000 

3.40 

.4997 

.0012 

3.70 

.4999 

.0004 

4.00 

.5000 

.0001 




3.41 

.4997 

.0012 

3.71 

.4999 

.0004 

4.01 

.5000 

.0001 




3.42 

.4997 

.0012 

3.72 

.4999 

.0004 

4.02 

.5000 

.0001 




3.43 

.4997 

.0011 

3.73 

.4999 

.0004 

4.03 

.5000 

.0001 




3.44 

.4997 

.0011 

3.74 

.4999 

.0004 

4.04 

.5000 

.0001 




3.45 

.4997 

.0010 

3.75 

.4999 

.0004 

4.05 

.5000 

.0001 




3.46 

.4997 

.0010 

3.76 

.4999 

.0003 

4.06 

.5000 

.0001 




3.47 

.4997 

.0010 

3.77 

.4999 

.0003 

4.07 

.5000 

.0001 




3.48 

.4998 

.0009 

3.78 

.4999 

.0003 

4.08 

.5000 

.0001 




3.49 

.4998 

.0009 

3.79 

.4999 

.0003 

4.09 

.5000 

.0001 
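The entries of this table can be spot-checked numerically. The following is a minimal sketch, assuming the two tabulated columns are the area under the standard normal curve from 0 to t and the ordinate of the curve at t; the function names are illustrative and do not come from the text.

```python
import math


def normal_area(t: float) -> float:
    """Area under the standard normal curve between 0 and t."""
    return 0.5 * math.erf(t / math.sqrt(2.0))


def normal_ordinate(t: float) -> float:
    """Height of the standard normal curve at t."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


# Print a few rows in the same layout as the table: t, area, ordinate.
for t in (1.96, 2.33, 3.09):
    print(f"{t:.2f}  {normal_area(t):.4f}  {normal_ordinate(t):.4f}")
```

Rounded to four places, these rows agree with the printed entries for t = 1.96, 2.33, and 3.09.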









APPENDIX C 

TABLES OF LOGARITHMS AND ANTILOGARITHMS 

FOUR-PLACE LOGARITHMS 


N 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

0000 

0043 

0086 

0128 

0170 

0212 

0253 

0294 

0334 

0374 

11 

0414 

0453 

0492 

0531 

0569 

0607 

0645 

0682 

0719 

0755 

12 

0792 

0828 

0864 

0899 

0934 

0969 

1004 

1038 

1072 

1106 

13 

1139 

1173 

1206 

1239 

1271 

1303 

1335 

1367 

1399 

1430 

14 

1461 

1492 

1523 

1553 

1584 

1614 

1644 

1673 

1703 

1732 

15 

1761 

1790 

1818 

1847 

1875 

1903 

1931 

1959 

1987 

2014 

16 

2041 

2068 

2095 

2122 

2148 

2175 

2201 

2227 

2253 

2279 

17 

2304 

2330 

2355 

2380 

2405 

2430 

2455 

2480 

2504 

2529 

18 

2553 

2577 

2601 

2625 

2648 

2672 

2695 

2718 

2742 

2765 

19 

2788 

2810 

2833 

2856 

2878 

2900 

2923 

2945 

2967 

2989 

20 

3010 

3032 

3054 

3075 

3096 

3118 

3139 

3160 

3181 

3201 

21 

3222 

3243 

3263 

3284 

3304 

3324 

3345 

3365 

3385 

3404 

22 

3424 

3444 

3464 

3483 

3502 

3522 

3541 

3560 

3579 

3598 

23 

3617 

3636 

3655 

3674 

3692 

3711 

3729 

3747 

3766 

3784 

24 

3802 

3820 

3838 

3856 

3874 

3892 

3909 

3927 

3945 

3962 

25 

3979 

3997 

4014 

4031 

4048 

4065 

4082 

4099 

4116 

4133 

26 

4150 

4166 

4183 

4200 

4216 

4232 

4249 

4265 

4281 

4298 

27 

4314 

4330 

4346 

4362 

4378 

4393 

4409 

4425 

4440 

4456 

28 

4472 

4487 

4502 

4518 

4533 

4548 

4564 

4579 

4594 

4609 

29 

4624 

4639 

4654 

4669 

4683 

4698 

4713 

4728 

4742 

4757 

30 

4771 

4786 

4800 

4814 

4829 

4843 

4857 

4871 

4886 

4900 

31 

4914 

4928 

4942 

4955 

4969 

4983 

4997 

5011 

5024 

5038 

32 

5051 

5065 

5079 

5092 

5105 

5119 

5132 

5145 

5159 

5172 

33 

5185 

5198 

5211 

5224 

5237 

5250 

5263 

5276 

5289 

5302 

34 

5315 

5328 

5340 

5353 

5366 

5378 

5391 

5403 

5416 

5428 

35 

5441 

5453 

5465 

5478 

5490 

5502 

5514 

5527 

5539 

5551 

36 

5563 

5575 

5587 

5599 

5611 

5623 

5635 

5647 

5658 

5670 

37 

5682 

5694 

5705 

5717 

5729 

5740 

5752 

5763 

5775 

5786 

38 

5798 

5809 

5821 

5832 

5843 

5855 

5866 

5877 

5888 

5899 

39 

5911 

5922 

5933 

5944 

5955 

5966 

5977 

5988 

5999 

6010 

40 

6021 

6031 

6042 

6053 

6064 

6075 

6085 

6096 

6107 

6117 

41 

6128 

6138 

6149 

6160 

6170 

6180 

6191 

6201 

6212 

6222 

42 

6232 

6243 

6253 

6263 

6274 

6284 

6294 

6304 

6314 

6325 

43 

6335 

6345 

6355 

6365 

6375 

6385 

6395 

6405 

6415 

6425 

44 

6435 

6444 

6454 

6464 

6474 

6484 

6493 

6503 

6513 

6522 

45 

6532 

6542 

6551 

6561 

6571 

6580 

6590 

6599 

6609 

6618 

46 

6628 

6637 

6646 

6656 

6665 

6675 

6684 

6693 

6702 

6712 

47 

6721 

6730 

6739 

6749 

6758 

6767 

6776 

6785 

6794 

6803 

48 

6812 

6821 

6830 

6839 

6848 

6857 

6866 

6875 

6884 

6893 

49 

6902 

6911 

6920 

6928 

6937 

6946 

6955 

6964 

6972 

6981 

50 

6990 

6998 

7007 

7016 

7024 

7033 

7042 

7050 

7059 

7067 

51 

7076 

7084 

7093 

7101 

7110 

7118 

7126 

7135 

7143 

7152 

52 

7160 

7168 

7177 

7185 

7193 

7202 

7210 

7218 

7226 

7235 

53 

7243 

7251 

7259 

7267 

7275 

7284 

7292 

7300 

7308 

7316 

54 

7324 

7332 

7340 

7348 

7356 

7364 

7372 

7380 

7388 

7396 

N 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 




7466 

7543 

7474 

7551 

7619 

7627 

7694 

7701 

7767 

7774 

7839 

7846 

7910 

7917 

7980 

7987 

8048 

8055 

8116 

8122 

8182 

8189 

8248 

8254 

8312 

8319 

8376 

8382 

8439 

8445 

8500 

8506 

8561 

8567 

8621 

8627 

8681 

8686 

8739 

8745 

8797 

8802 

8854 

8859 

8910 

8915 

8965 

8971 

9020 

9025 

9074 

9079 

9128 

9133 

9180 

9186 

9232 

9238 

9284 

9289 

9335 

9340 

9385 

9390 

9435 

9440 

9484 

9489 

9533 

9538 

9581 

9586 

9628 

9633 

9675 

9680 

9722 

9727 

9768 

9773 

9814 

9818 

9859 

9863 

9903 

9908 

9948 

9952 

9991 

9996 
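Any entry of the four-place logarithm table can be regenerated, which is useful where the scan has lost or garbled a value. The following is a minimal sketch, assuming each entry is the mantissa of the common logarithm rounded to four places; the helper name `mantissa` is illustrative.

```python
import math


def mantissa(n: float) -> str:
    """Four-place mantissa of log10(n), formatted as in the table (e.g. '3010').

    Intended for three-significant-figure arguments such as 4.20 or 105;
    the position of the decimal point in n does not affect the mantissa.
    """
    fractional_part = math.log10(n) % 1.0
    return f"{round(fractional_part * 10_000):04d}"


# Row N = 42, column 0, and row N = 10, column 5.
print(mantissa(4.20), mantissa(1.05))
```

For example, `mantissa(2.00)` reproduces the entry in row 20, column 0.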





ANTILOGARITHMS 




Loga- 

rithms 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

.00 

1000 

1002 

1005 

1007 

1009 

1012 

1014 

1016 

1019 

1021 

.01 

1023 

1026 

1028 

1030 

1033 

1035 

1038 

1040 

1042 

1045 

.02 

1047 

1050 

1052 

1054 

1057 

1059 

1062 

1064 

1067 

1069 

.03 

1072 

1074 

1076 

1079 

1081 

1084 

1086 

1089 

1091 

1094 

.04 

1096 

1099 

1102 

1104 

1107 

1109 

1112 

1114 

1117 

1119 

.05 

1122 

1125 

1127 

1130 

1132 

1135 

1138 

1140 

1143 

1146 

.06 

1148 

1151 

1153 

1156 

1159 

1161 

1164 

1167 

1169 

1172 

.07 

1175 

1178 

1180 

1183 

1186 

1189 

1191 

1194 

1197 

1199 

.08 

1202 

1205 

1208 

1211 

1213 

1216 

1219 

1222 

1225 

1227 

.09 

1230 

1233 

1236 

1239 

1242 

1245 

1247 

1250 

1253 

1256 

.10 

1259 

1262 

1265 

1268 

1271 

1274 

1276 

1279 

1282 

1285 

.11 

1288 

1291 

1294 

1297 

1300 

1303 

1306 

1309 

1312 

1315 

.12 

1318 

1321 

1324 

1327 

1330 

1334 

1337 

1340 

1343 

1346 

.13 

1349 

1352 

1355 

1358 

1361 

1365 

1368 

1371 

1374 

1377 

.14 

1380 

1384 

1387 

1390 

1393 

1396 

1400 

1403 

1406 

1409 

.15 

1413 

1416 

1419 

1422 

1426 

1429 

1432 

1435 

1439 

1442 

.16 

1445 

1449 

1452 

1455 

1459 

1462 

1466 

1469 

1472 

1476 

.17 

1479 

1483 

1486 

1489 

1493 

1496 

1500 

1503 

1507 

1510 

.18 

1514 

1517 

1521 

1524 

1528 

1531 

1535 

1538 

1542 

1545 

.19 

1549 

1552 

1556 

1560 

1563 

1567 

1570 

1574 

1578 

1581 

.20 

1585 

1589 

1592 

1596 

1600 

1603 

1607 

1611 

1614 

1618 

.21 

1622 

1626 

1629 

1633 

1637 

1641 

1644 

1648 

1652 

1656 

.22 

1660 

1663 

1667 

1671 

1675 

1679 

1683 

1687 

1690 

1694 

.23 

1698 

1702 

1706 

1710 

1714 

1718 

1722 

1726 

1730 

1734 

.24 

1738 

1742 

1746 

1750 

1754 

1758 

1762 

1766 

1770 

1774 

.25 

1778 

1782 

1786 

1791 

1795 

1799 

1803 

1807 

1811 

1816 

.26 

1820 

1824 

1828 

1832 

1837 

1841 

1845 

1849 

1854 

1858 

.27 

1862 

1866 

1871 

1875 

1879 

1884 

1888 

1892 

1897 

1901 

.28 

1905 

1910 

1914 

1919 

1923 

1928 

1932 

1936 

1941 

1945 

.29 

1950 

1954 

1959 

1963 

1968 

1972 

1977 

1982 

1986 

1991 

.30 

1995 

2000 

2004 

2009 

2014 

2018 

2023 

2028 

2032 

2037 

.31 

2042 

2046 

2051 

2056 

2061 

2065 

2070 

2075 

2080 

2084 

.32 

2089 

2094 

2099 

2104 

2109 

2113 

2118 

2123 

2128 

2133 

.33 

2138 

2143 

2148 

2153 

2158 

2163 

2168 

2173 

2178 

2183 

.34 

2188 

2193 

2198 

2203 

2208 

2213 

2218 

2223 

2228 

2234 

.35 

2239 

2244 

2249 

2254 

2259 

2265 

2270 

2275 

2280 

2286 

.36 

2291 

2296 

2301 

2307 

2312 

2317 

2323 

2328 

2333 

2339 

.37 

2344 

2350 

2355 

2360 

2366 

2371 

2377 

2382 

2388 

2393 

.38 

2399 

2404 

2410 

2415 

2421 

2427 

2432 

2438 

2443 

2449 

.39 

2455 

2460 

2466 

2472 

2477 

2483 

2489 

2495 

2500 

2506 

.40 

2512 

2518 

2523 

2529 

2535 

2541 

2547 

2553 

2559 

2564 

.41 

2570 

2576 

2582 

2588 

2594 

2600 

2606 

2612 

2618 

2624 

.42 

2630 

2636 

2642 

2649 

2655 

2661 

2667 

2673 

2679 

2685 

.43 

2692 

2698 

2704 

2710 

2716 

2723 

2729 

2735 

2742 

2748 

.44 

2754 

2761 

2767 

2773 

2780 

2786 

2793 

2799 

2805 

2812 

.45 

2818 

2825 

2831 

2838 

2844 

2851 

2858 

2864 

2871 

2877 

.46 

2884 

2891 

2897 

2904 

2911 

2917 

2924 

2931 

2938 

2944 

.47 

2951 

2958 

2965 

2972 

2979 

2985 

2992 

2999 

3006 

3013 

.48 

3020 

3027 

3034 

3041 

3048 

3055 

3062 

3069 

3076 

3083 

.49 

3090 

3097 

3105 

3112 

3119 

3126 

3133 

3141 

3148 

3155 


0 

1 

2 

3 

4 

5 

6 

7 

8 

9 




Loga- 

rithms 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

.50 

3162 

3170 

3177 

3184 

3192 

3199 

3206 

3214 

3221 

3228 

.51 

3236 

3243 

3251 

3258 

3266 

3273 

3281 

3289 

3296 

3304 

.52 

3311 

3319 

3327 

3334 

3342 

3350 

3357 

3365 

3373 

3381 

.53 

3388 

3396 

3404 

3412 

3420 

3428 

3436 

3443 

3451 

3459 

.54 

3467 

3475 

3483 

3491 

3499 

3508 

3516 

3524 

3532 

3540 

.55 

3548 

3556 

3565 

3573 

3581 

3589 

3597 

3606 

3614 

3622 

.56 

3631 

3639 

3648 

3656 

3664 

3673 

3681 

3690 

3698 

3707 

.57 

3715 

3724 

3733 

3741 

3750 

3758 

3767 

3776 

3784 

3793 

.58 

3802 

3811 

3819 

3828 

3837 

3846 

3855 

3864 

3873 

3882 

.59 

3890 

3899 

3908 

3917 

3926 

3936 

3945 

3954 

3963 

3972 

.60 

3981 

3990 

3999 

4009 

4018 

4027 

4036 

4046 

4055 

4064 

.61 

4074 

4083 

4093 

4102 

4111 

4121 

4130 

4140 

4150 

4159 

.62 

4169 

4178 

4188 

4198 

4207 

4217 

4227 

4236 

4246 

4256 

.63 

4266 

4276 

4285 

4295 

4305 

4315 

4325 

4335 

4345 

4355 

.64 

4365 

4375 

4385 

4395 

4406 

4416 

4426 

4436 

4446 

4457 

.65 

4467 

4477 

4487 

4498 

4508 

4519 

4529 

4539 

4550 

4560 

.66 

4571 

4581 

4592 

4603 

4613 

4624 

4634 

4645 

4656 

4667 

.67 

4677 

4688 

4699 

4710 

4721 

4732 

4742 

4753 

4764 

4775 

.68 

4786 

4797 

4808 

4819 

4831 

4842 

4853 

4864 

4875 

4887 

.69 

4898 

4909 

4920 

4932 

4943 

4955 

4966 

4977 

4989 

5000 

.70 

5012 

5023 

5035 

5047 

5058 

5070 

5082 

5093 

5105 

5117 

.71 

5129 

5140 

5152 

5164 

5176 

5188 

5200 

5212 

5224 

5236 

.72 

5248 

5260 

5272 

5284 

5297 

5309 

5321 

5333 

5346 

5358 

.73 

5370 

5383 

5395 

5408 

5420 

5433 

5445 

5458 

5470 

5483 

.74 

5495 

5508 

5521 

5534 

5546 

5559 

5572 

5585 

5598 

5610 

.75 

5623 

5636 

5649 

5662 

5675 

5689 

5702 

5715 

5728 

5741 

.76 

5754 

5768 

5781 

5794 

5808 

5821 

5834 

5848 

5861 

5875 

.77 

5888 

5902 

5916 

5929 

5943 

5957 

5970 

5984 

5998 

6012 

.78 

6026 

6039 

6053 

6067 

6081 

6095 

6109 

6124 

6138 

6152 

.79 

6166 

6180 

6194 

6209 

6223 

6237 

6252 

6266 

6281 

6295 

.80 

6310 

6324 

6339 

6353 

6368 

6383 

6397 

6412 

6427 

6442 

.81 

6457 

6471 

6486 

6501 

6516 

6531 

6546 

6561 

6577 

6592 

.82 

6607 

6622 

6637 

6653 

6668 

6683 

6699 

6714 

6730 

6745 

.83 

6761 

6776 

6792 

6808 

6823 

6839 

6855 

6871 

6887 

6902 

.84 

6918 

6934 

6950 

6966 

6982 

6998 

7015 

7031 

7047 

7063 

.85 

7079 

7096 

7112 

7129 

7145 

7161 

7178 

7194 

7211 

7228 

.86 

7244 

7261 

7278 

7295 

7311 

7328 

7345 

7362 

7379 

7396 

.87 

7413 

7430 

7447 

7464 

7482 

7499 

7516 

7534 

7551 

7568 

.88 

7586 

7603 

7621 

7638 

7656 

7674 

7691 

7709 

7727 

7745 

.89 

7762 

7780 

7798 

7816 

7834 

7852 

7870 

7889 

7907 

7925 

.90 

7943 

7962 

7980 

7998 

8017 

8035 

8054 

8072 

8091 

8110 

.91 

8128 

8147 

8166 

8185 

8204 

8222 

8241 

8260 

8279 

8299 

.92 

8318 

8337 

8356 

8375 

8395 

8414 

8433 

8453 

8472 

8492 

.93 

8511 

8531 

8551 

8570 

8590 

8610 

8630 

8650 

8670 

8690 

.94 

8710 

8730 

8750 

8770 

8790 

8810 

8831 

8851 

8872 

8892 

.95 

8913 

8933 

8954 

8974 

8995 

9016 

9036 

9057 

9078 

9099 

.96 

9120 

9141 

9162 

9183 

9204 

9226 

9247 

9268 

9290 

9311 

.97 

9333 

9354 

9376 

9397 

9419 

9441 

9462 

9484 

9506 

9528 

.98 

9550 

9572 

9594 

9616 

9638 

9661 

9683 

9705 

9727 

9750 

.99 

9772 

9795 

9817 

9840 

9863 

9886 

9908 

9931 

9954 

9977 


0 

1 

2 

3 

4 

5 

6 

7 

8 

9 
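The antilogarithm table can be checked in the same way. The following is a minimal sketch, assuming each entry gives the first four significant digits of 10 raised to the tabulated three-place logarithm; the helper name `antilog4` is illustrative.

```python
def antilog4(log_value: float) -> int:
    """First four significant digits of 10**log_value for 0 <= log_value < 1."""
    return round(10 ** (log_value + 3))


# Row .24, column 7; row .50, column 0; row .77, column 8.
print(antilog4(0.247), antilog4(0.500), antilog4(0.778))
```

The three values printed correspond to the table entries 1766, 3162, and 5998.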




ANSWERS TO EXERCISES 


CHAPTER 1 


Page 8 

1. + + 

p T 2 2 3 2 4 2 n 2 

2. \(2^1 + 2^2 + 2^3 + 2^4 + \cdots + 2^n\). 

3. \((1 - 3) + (2 - 3) + (3 - 3) + (4 - 3) + \cdots + (n - 3)\). 

4. \({}_{10}C_1 + {}_{10}C_2 + {}_{10}C_3 + \cdots + {}_{10}C_{10}\). 

5. \(a + 2a^2 + 3a^3 + 4a^4 + \cdots + na^n\). 

6. \({}_{10}C_1 + 2 \cdot {}_{10}C_2 + 3 \cdot {}_{10}C_3 + 4 \cdot {}_{10}C_4 + \cdots + 10 \cdot {}_{10}C_{10}\). 

7. \(10f(10) + 20f(20) + 30f(30) + \cdots + 100f(100)\). 

8. \(25f(5) + 100f(10) + \cdots + 6400f(80)\). 

9. \(\sum_{x=1} x(x + 1)\). 10. \(\sum_{i=1} (X_i - M)^2\). 


Page 10 

4. (1) 1,911. (2) 5,635. 


Pages 13-14 

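1. a. \(\sum x = 0\), \((\sum x)^2 = 0\), \(\sum x^2 = 1{,}308\), \(\sqrt{\sum x^2} = 36.16\). 

b. \(\sum U = 200\), \(\sum U^2 = 5{,}486\), \(\sum X = 700\), \(\sum X^2 = 50{,}486\). 

c. \(\sum X^2 = 220\), \(\sum Y^2 = 275\), \((\sum X)(\sum Y) = 1{,}050\), \(\sum XY = 176\). 

d. \(\sum x = 0\), \(\sum y = 0\), \(\sum x^2 = 40\), \(\sum xy = -34\), \(\sum xy / \sum x^2 = -0.85\). 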

Pages 17-18 

1. (1) 4. (2) 3. (3) 2. (4) 3. (5) 5. 

3. 0.00004. 4. 0.00004. 5. 2%. 6. 0.147%. 7. 0.04%. 

8. \(5.165 \times 10^9\); about 0.01%. 10. (1) 2,142. (2) 2,774. 

11. (1) 178.55. (2) 178.55. 12. (1) 310.53. (2) 310.53. 


Pages 21-22 


1. 363 ± 0.5. 2. 24,725 ± 87.5. 3. W ± 0.112. 4. 4.05 sq. ft. 

7. The former. 11. \(4n^3 + 4n^2 + 3n\). 12. \(\tfrac{1}{6}[4n^3 + 33n^2 + 89n]\). 

13. 42,540. 15. (1) 8,888. (2) 123,464. 17. (1) 154,198. (2) 109,802. 

18. \(\dfrac{n(n + 1)(2n + 4)}{6}\). 19. \(\dfrac{n(n + 1)(2n + 7)}{6}\). 

21. 24,001,875. 22. \(\dfrac{n(n + 1)(n + 2)(3n + 5)}{12}\). 





23. $5.13 per ton; 0.000192. 24. $150,000,000; 3.8%. 

25. $469,098,000; 0.178%. 26. 105.9 bu.; 0.29%. 

27. $330.75. 28. A: 39.37%; B: 37.25%. 

29. 20%. 30. 9%. 31. $187.50. 

32. 93.5% of value in 1933. 33. 5.3 persons; 0.004. 


CHAPTER 3 


Pages 68-71 


1. M = 7.5. 2. M = 1.956. 3. M = 53.7. 

8. M_x = 27. 9. M_x = 360. 10. M_x = 237.25. 

11. M_x = 260. 12. M_x = 537.3. 13. M_x = 206.25. 

14. M_x = 448. 16. M_x = 20.2. 22. M_x = 53.7. 

Pages 74-76 



1. M = 67.42 inches. 2. M = 139.39 pounds. 3. M = 6.06 inches. 

5. $31.87. 6. $35.08. 11. 22 cents. 

12. $16.77; $21.48; $15.90. 13. 1,000 lbs. per sq. in. 


Page 79 


1. M_d = 1.98 cm. 2. M_d = $449.50. 

3. a. M_d = 67.52 inches. b. M_d = 138.12 pounds. 

4. $15.71; $21.89; $15. 

6. M_d = 991.3 lbs. per i 

7. 22. 8. M_d = 53.7 rays. 

Page 85 




      King     Parabola   Pearson   Unit 
1.    1.998    1.997      2.02      cm. 
2.    53.07    53.34      53.64     rays 
3.    68.00    68.01      67.72     in. 
4.    6.01     6.02       6.03      in. 

Page 91 




a. 3.27%. 

b. 3.45%. 

c. 2.5%. 


Pages 91-92 




1. M a = 17,043. 


2. 9.32%. 

3. 20.5%; $795; $632.03; 

$502.46. 

4. 2.62%. 


Pages 97-98 

1. 26 mi. per day. 2. 66^ per bu. 3. 15^ per gal. 4. 24§ days, i 
6. a. 9f units per hr. 6. 45 mi. per hr. 

b. 6£ min. 7. a. 8 problems per hr. 

c. 384. b. 7.5 min. per problem. 





Pages 101-10 

2. 86%. 3. 5.8%. 4. a. M = 149.799. b. M = 150.77. 

5. a. M_d = 151.09. b. M_d = 153.73. 

6. a. M_o = 165.46. b. M_o = 164.74. 

7. a. 0.154. b. M„ = 8,535.6. 8 . M = 38.1 years. 

9. M = 6.9 years. M_d = 4.1 years. 

10. 4.76%. Estimated values: 42,222; 53,278; 67,224; 170,428. 

11. a. 6.0759 pounds. b. 16.46¢ per lb. 

14. 11.12¢ per lb.; 8.99¢ per lb. 

15. a. 56.32. b. 41.92. 16. a. 56.25. b. 41.95. 

17. 166.6 years. 18. 542.9 millions. 

19. Group A: M d = $7.54; M x = $7,395. 

Group B: M d = $7.54; M x = $7,015. 

20. 1.3%; 11,085,900. 21. \(6(10)^6\). 

22. 21.4%; $485.58. 23. M = 648.7; M, = 399.1. 

24. 8.56%. 25. 7.875%; 2,120,350 ; 4.93%. 26. About 7.5 years. 

27. 4.99%. 28. jVs units per minute. 29. 22 cents. 

31. M = $1,371.72; M_d = $1,365.37; M_o = $1,432.83. 

32. M = 3.39%. 33. 108.9 millions of barrels. 

35. Cincinnati: $12.46; Cleveland: $22.33; St. Louis: $15.72. 

36. M = $1,280.01; M_d = $1,264.07; M_o = $1,260.92. 

37. 10.6%. 38. 1,267.9 millions of dollars. 

39. At an infinite speed. 42. np. 

«. M = ^ri^ 2n+1 - u M ' = *■’ M » = 

CHAPTER 4 

Page 114 

1. 27%. 2. 47%. 3. No. 4. 16%; 51%. 

5. 7.5 to 8 inches. 6. 50 to 55 pounds. 

7. M = M_d = M_o = 35 pounds for A and B. No. 

Page 119 

1. 7.2%. 

2. a. Q_1 = 65.85 inches.        b. Q_1 = 127.19 pounds. 
      Q_3 = 69.06 inches.           Q_3 = 149.63 pounds. 
      Q = 1.6 inches.               Q = 11.22 pounds. 
      V_Q = 2.4%.                   V_Q = 8.1%. 

4. 
        D_1    D_2    D_3    D_4    D_5    D_6    D_7    D_8    D_9 
   a.   90.8  114.4  128.0  139.4  151.1  162.3  171.4  186.8  203.8 
   b.   92.2  114.5  129.0  140.6  153.7  163.0  171.2  187.3  205.3 





5. Absolute: \(D_6 - D_4\), \(D_7 - D_3\), etc. Relative: \(\dfrac{D_6 - D_4}{D_6 + D_4}\), \(\dfrac{D_7 - D_3}{D_7 + D_3}\), etc. 

7. 5.731, 5.859, 6.192 inches. 8. Q_1 = 55.6, Q_3 = 75.4. 

9 . Cincinnati: $12.17 to $20.17. 

Cleveland: $17.89 to $25.48. 

St. Louis: $12.27 to $18.57. 

10. Q_1 = 5.94 inches, Q_3 = 6.19 inches, Q = 0.125 inch. No. 

11. Q_1 = 52.2 rays, Q_3 = 55.1 rays. 

12. $1,197.00 to $1,506.72. 13. No. 

Page 124 

1. a. 0, 86, 14.3. 
   b. 0, 56, 9.3. 
   c. 0, 156, 26.0. 

2. Wheat: 1.08 bu. per acre. 
   Rye: 1.3 bu. per acre. 
   Oats: 2.03 bu. per acre. 

3. M = 69.5; M.D. about M = 11.32; 58.1%. 4. No. 

Pages 127-28 

1. Wheat: M = 14.34 bu. per acre; σ = 1.23 bu. per acre. 
   Rye: M = 12.15 bu. per acre; σ = 1.6 bu. per acre. 
   Oats: M = 30.27 bu. per acre; σ = 2.4 bu. per acre. 

3. M = 20, σ = 6.45. 

Page 133 

2. Cincinnati: σ = $6.86, M = $16.77. 
   Cleveland: σ = $6.28, M = $21.48. 
   St. Louis: σ = $6.04, M = $15.90. 

3. M = $1,280.01, σ = $150.01. 

4. M = $1,371.72, σ = $247.03, V_σ = 18%. 

5. A: M = 100.95, M_d = 101.02, M_o = 101.1, σ = 13. 
   B: M = 47.71, M_d = 47.68, M_o = 48.08, σ = 5.88. 


Page 139 
4. a. 

   Sample        M        σ 
   1st 100     142.35    22.8 
   2nd 100     138.75    20.1 
   3rd 100     138.65    19.1 
   4th 100     139.05    18.2 
   5th 100     138.35    14.9 
   6th 100     139.35    17.2 
   7th 100     137.05    16.2 
   8th 100     138.75    16.8 
   9th 100     140.65    17.7 
   10th 100    138.55    15.4 
   Total       139.15    18.03 

b. σ_means = 1.36 pounds. M_means = 139.15 pounds. 
c. Seven. d. They are equal. 





Pages 147-49 

1. σ = 0.20 inch, M.D. = 0.159 inch. 

2. σ = 0.19 cm., 49.3, Yes, M = 1.956 cm. 

3. M = 5, σ = 1.58, 971. 

4. a. M = 67.42 inches, σ = 2.43 inches, V_σ = 0.036. 

b. M = 139.39 pounds, σ = 17.2 pounds, V_σ = 0.12. 

5. a. M = 149.8, σ = 42.5, V_σ = 0.28. 

b. M = 150.8, σ = 42.2, V_σ = 0.28. 

6. a. M = 56.323 mm., σ = 2.404 mm., V_σ = 0.043. 

b. M = 41.917 mm., σ = 1.385 mm., V_σ = 0.033. 

7. b. *\/3(]V 2 - 1). 9. 17. 

10. Q = 5.3, Qx = 66.7, Q, = 77.3, = 72, E x = 5.3, = 0.17, E a = 0.12 

11 . a. E m = 0.04 inch, E a = 0.03 inch, 
b. = 0.3 pound, E a = 0.2 pound. 

12. Em = 0.004 inch, E v = 0.003 inch. 

13. V at = 0.22, F„ = 0.26. 

14. Scores: E M = 0.37, Production: = 0.97. 

16. (1) 200. (2) 285, $90 and $30. 

16. The first distribution. 17. The distribution of weights. 

27. M = np, \(\sigma = \sqrt{npq}\). 28. X = ~ = M x . 

Page 164 



         A       B        C        D 
M        20      26.7     13.3     9.75 
M_d      20      28.96    11.04    7.5 
σ        7.2     7.25     7.25     6.42 
Sk       0       -0.93    0.93     1.05 
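The skewness row of this table can be reproduced from the other three rows. The following is a minimal sketch, assuming Sk is Pearson's coefficient 3(M − M_d)/σ; the answer key does not state the formula at this point, so that choice is an assumption that merely agrees with the tabulated values to within rounding.

```python
def pearson_skewness(mean: float, median: float, sigma: float) -> float:
    """Pearson's median-based skewness, 3 * (mean - median) / sigma (assumed formula)."""
    return 3.0 * (mean - median) / sigma


# (M, M_d, sigma) for distributions A to D, copied from the table above.
distributions = {
    "A": (20.0, 20.0, 7.2),
    "B": (26.7, 28.96, 7.25),
    "C": (13.3, 11.04, 7.25),
    "D": (9.75, 7.5, 6.42),
}

for name, (mean, median, sigma) in distributions.items():
    print(name, round(pearson_skewness(mean, median, sigma), 3))
```

Because the tabulated means and medians are themselves rounded, the recomputed values can differ from the printed ones in the last digit.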


CHAPTER 5 

Pages 167-68 

1. 0.0083. 2. a. - 0.125. b. 0.220. 3. a. - 0.047. b. 0.026. 

4. The unadjusted values are: 

          a         b 
M        149.8     150.77 
σ        42.5      42.2 
α_3      -0.05     -0.08 
α_4      2.65      2.58 






         Unadjusted Values       Adjusted Values 
            a         b             a         b 
M         56.322    41.917       56.322    41.917 
σ         2.404     1.385        2.386     1.378 
α_3       0.603     0.128        0.616     0.130 
α_4       4.334     2.8904       4.373     2.8832 


7. M = 6.062 inches, σ = 0.20 inch. 



         Unadjusted    Adjusted 
M         39.835       39.835 
M_d       39.831 
M_o       39.802 
σ         2.052        2.032 
α_3       0.0287       0.0296 


         Unadjusted    Adjusted 
M         171.8917     171.8917 
M_d       171.8858 
M_o       173.407 
σ         6.8236       6.7992 
α_3       0.1125       0.1137 
α_4       3.194        3.197 


CHAPTER 6 


Pages 177-78 


Year    Relatives to 1909      Year    Relatives to 1909 
1909         100               1916          90 
1910          90               1917          80 
1911          83               1918          72 
1912          88               1919          78 
1913          86               1920          76 
1914          84               1921          61 
1915          83               1922          71 


Year    Relatives    Link           Year    Relatives    Link 
        to 1909      Relatives              to 1909      Relatives 
1909      100                       1914      138          106 
1910      106          106          1915      151          109 
1911      111          105          1916      160          106 
1912      120          108          1917      175          110 
1913      130          108          1918      190          109 




Year    Rel.      Year    Rel. 
1910     67       1920     87 
1911     69       1921     87 
1912     72       1922    101 
1913     80       1923    120 
1914     77       1924    130 
1915     75       1925    141 
1916     80       1926    144 
1917     81       1927    151 
1918     62       1928    154 
1919     71       1929    150 


Year    Rel. (Corn)    Rel. (Hogs)    Average Price Rel. 
1920        96            138               117 
1921        60             84                72 
1922        94             91                93 
1923       103             75                89 
1924       140             80               110 
1925        96            117               107 
1926        91            122               107 
1927       103             99               101 
1928       107             91                99 
1929       111            100               106 


Pages 183-84 

1. 

Year                          1921    1923    1925    1927    1929 
Aggregative Relative          100     104.5   103     99.6    96.9 

2. 

Year                          1921    1923    1925    1927    1929 
Harmonic Mean of Relatives    100     119     134     127     128 















Page 200 

6. 150.6. 7. 94.0; 118.3. 



        1921     1923 
(1)     89.7     94.7 
(2)     91.4     90.4 
(3)     90.5     92.5 
(4)     79.7     124.1 
(5)     90.5     92.2 


9. 147. 


CHAPTER 7 


Page 206 

1. 2.5. 3. - 4.5. 6. 0. 9. 3. 

2. 1. 4. - V 1 - 6. 0. . 10. - 2. 


Pages 209-10 

1. Y = 3X + 2. 2. Y = 3X - 2. 

3. Y = - 3X + 2. 4. Y = - 3X - 2. 

6. a. Slope-intercept form, m = 3, b = — 4. 

6. a. Y = bX- 7. b. 2F = 3X + 10. c. Y = - X + 8. 

8. 7X — 5F + 4 = 0, — f, |. 13. Y = 3X — 1. 

14. 4X + 3Y = 17. 15. Yes. 16. No. 


Page 216 

Y = 2.975X - 2.025. 


Pages 220-21 

1. R = 0.02799* + 10.122. 2. l = 0.02w + 90.22. 

  w       l           w       l 
  50     91.22       350     97.22 
 100     92.22       400     98.22 
 150     93.22       450     99.22 
 200     94.22       500    100.22 
 250     95.22       550    101.22 
 300     96.22       600    102.22 





Pages 225-26 

1. a. Y = 0.5X + 1.          c. Y = -0.85X + 12.1. 
   b. Y = 2.6X - 2.          d. Y = -1.5X + 50. 

2. S = 0.52T + 54.2, 80.2.   3. Y = 0.765X + 22.9, 68.8. 

4. a. W = 1.02L - 3.123.     5. L = -0.675T + 603.5, 552.875. 

  T     Observed L    Computed L 
  70       556          556.25 
  80       550          549.5 
  90       542          542.75 
 100       536          536.00 
 110       530          529.25 
 120       523          522.5 
 130       515          515.75 
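The "Computed L" column follows directly from the fitted line L = -0.675T + 603.5 given in this answer. The following is a minimal sketch that evaluates the line at each tabulated T and prints the residual against the observed value; the function name and the dictionary are illustrative.

```python
def computed_length(t: float) -> float:
    """Evaluate the fitted line L = -0.675 T + 603.5 from the answer above."""
    return -0.675 * t + 603.5


# Observed L for each T, copied from the table above.
observed = {70: 556, 80: 550, 90: 542, 100: 536, 110: 530, 120: 523, 130: 515}

for t, obs in observed.items():
    fit = computed_length(t)
    print(f"{t:>4} {obs:>5} {fit:8.2f} {obs - fit:+6.2f}")
```

Evaluating the same function at T = 75 gives the second value quoted with the equation, 552.875.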


Pages 229-30 

1. Y = 5.75X + 111.475 with X = 0 at 1919. 

2. (1) Y = 2.65X + 18.304. (2) $28.90. 

3. (1) Y = 3.483X + 26.44. (2) $43.86 millions. 

4. Y = -0.12X + 5.2. 


CHAPTER 8 


Pages 236-37 

3. (1) Y = -0.46X + 5.53. 

(3) S_y = 0.53 thousand strikes and lockouts. 

(4) 1.85 thousand strikes and lockouts. 

(5) Computed Y = 0.65. 

4. (1) Y = 9.82X - 29.7. 

(3) S_y = $17.6. 

(4) Computed Y = $68.5. 

5. (1) Y = 11.1X - 217.83. 

(3) S_y = $45.2. 

(4) Computed Y = $226.2. 

Page 243 

1. r = 0.95, Y = 1.02X - 12.62, S_y = 3.93 c.u. 

2. r = 0.89, Y = 0.075X + 4.72, S_y = 0.58 ton. 

3. r = 0.61, Y = 3.32X + 21.93, S_y = 3.3 bu. per acre. 





Pages 260-62 


1. r = -0.92. 2. r = 0.61. 3. a. r = -0.62. b. r = -0.55. 

5. r = 0.95. 6. r = -0.84. No. 

10. σ_X = σ_Y = 3.42. r = m = 1. Y = X + 4. 

11. a. σ_X = 1.414.          b. σ_X = 1.414. 
       σ_Y = 2.828.             σ_Y = 4.242. 
       r = -1.                  r = -1. 
       m = -2.                  m = -3. 
       Y = -2X + 12.            Y = -3X + 13. 

12. a. σ_X = 4.87.           b. σ_X = 3.78. 
       σ_Y = 3.78.              σ_Y = 2.27. 
       r = 0.                   r = 0. 
       m = 0.                   m = 0. 
       Y = 4.                   Y = 3. 


Pages 260-262 

1. M_X = 53.77¢. M_Y = $7.82. r = 0.72. 

σ_X = 25.2¢. σ_Y = $4.34. Y = 0.124X + 1.15. 

If X = 75, Y_est = $10.45. S_y = $3.04. 

2. r = .40. Y = 1.36X + 108.5. 

X = 0.12Y + 13.2. 

3. M_X = 16.25 min. M_Y = 82.125%. r = -0.92. 

σ_X = 5.04 min. σ_Y = 9.25%. Y = -1.68X + 109.4. 

If X = 20, Y_est = 76%. S_y = 3.63%. 
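The last two figures of answer 3 can be recomputed from the quantities printed just before them. The following is a minimal sketch, assuming the standard error of estimate is obtained as σ_Y·√(1 − r²); that relation is an assumption here, since the answer key does not restate the formula, and the function names are illustrative.

```python
import math


def estimate(x: float, slope: float, intercept: float) -> float:
    """Value of Y predicted by the regression line Y = slope * X + intercept."""
    return slope * x + intercept


def standard_error_of_estimate(sigma_y: float, r: float) -> float:
    """Standard error of estimate, assumed to be sigma_y * sqrt(1 - r**2)."""
    return sigma_y * math.sqrt(1.0 - r * r)


# Answer 3: Y = -1.68 X + 109.4, sigma_Y = 9.25 per cent, r = -0.92, X = 20.
print(round(estimate(20, -1.68, 109.4), 1))
print(round(standard_error_of_estimate(9.25, -0.92), 2))
```

The first value rounds to the 76% quoted for X = 20, and the second agrees with the quoted 3.63%.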


Page 266 


1. 0.92. 

2. 0.85. 

3. ρ(I, II) = 0.64. 
   ρ(I, III) = 0.62. 
   ρ(II, III) = 0.78. 


Pages 270-76 

1. (1) Y = 0.075X + 4.72. 

(2) r = 0.89. Yes. 

(3) If X = 40, Y_est = 7.72 tons. 

(4) S_y = 0.58 ton. 

2. r = 0.92. 

3. r = 0.63. Y = 1.21X + 4.73. 

X = 0.32Y + 14.3. 

4. r = -0.84. Yes. 

7. r = 0.60. Y = 0.85X + 85.03. 

X = 0.43Y - 19.3. 

10. (1) r = 0.829. Y = 0.65X + 2.75. 

(2) Y_est = 6%. (3) M_Y = 6.14%. 

(4) X = 1.056Y - 1.26. (5) X_est = 6.13%. 

(6) M_X = 6.29%. (7) S_y = 0.39%, S_x = 0.50%. 





11. r = 0.80. 

12. (2) r = -0.71. (3) m = -0.32. 

(4) Y = -0.32X + 187.69. (5) 155.7¢, 123.7¢, 91.7¢. 

(6) S_y = 33.6¢. 13. r = 0.73. 

14. r = 0.90. Yes, spurious. 15. (2) If X = $175, Y = $138.88. 

16. The Bucknell test. (3) $0.746. 

CHAPTER 9 

Page 282 

2. (1) X_1 = 0.384X_2 + 1.646X_3 + 1.438. 

Pages 284-286 

2. (2) X_1 = 0.839X_2 + 0.462X_3 - 0.270. 

(4) R_1(23) = 0.83, S_1(23) = 1.47. 

3. (2) R_1(23) = 0.96, S_1(23) = 5.02. 

(3) X_1 = 0.258X_2 + 0.606X_3 + 14.2. 

4. (1) X_1 = 0.575X_2 + 1.092X_3 + 15.982. 

(2) 158.6. (3) 35.9. 

Page 288 

6. (2) R_1(23) = 0.96, S_1(23) = 2.36. 

Page 290 

1. (- 1, 3). 2. (6, 0). 3. (24, - 16). 4. (f, - §). 

Page 292 

2. (- 2, 1, 3). 3. (- 1, 2, - 3). 

Page 293 

1. (1) - 15. (2) - 56. 

Page 297 

5. 72.88%. 

Pages 301-303 

1. a. Weight = 0.994 Length + 2.660 Breadth - 112.217. 

b. 55.25 grams. 

c. Weight = 0.046 Length + 1.056 Bulk - 2.081. 

d. Weight = 1.098 Bulk - 0.098 Breadth + 2.416. 

e. S_w(l, br) = 0.924, S_w(l, blk) = 0.907, S_w(blk, br) = 0.909. 

2. a. X_1 = 0.55X_2 + 1.07X_3 + 0.083X_4 - 69. 

c. R_1(234) = 0.826. 

d. r_12.34 = 0.764, r_13.24 = 0.676, r_14.23 = 0.09. 





CHAPTER 10 

Pages 310-311 

5. a = 1, b = -2, c = 2. If X = 5, Y = 17. 

6. \(Y = X^3 - 2X\). If X = 5, Y = 115. 

Page 313 

R = 0.02792J + 10.1368. 

Page 316 

3. With X = 0 at 1924, Y = 3.483X + 26.44. 

At 1929, X = 5, Y = 43.86 millions of dollars. 

Page 323 

2. Using L.S., \(p = 30(0.99996)^h\). 

h     1,000    2,000    5,000 
p     28.9     27.9     24.9 


3. L.S. gives T = 17.8021(0.9865)'. 

4. With t = 0 at 1920, L.S. gives \(X = 100{,}006(1.127)^t\). 

6. L.S. gives H = 0.86(1.39)". 

Pages 329-330 

2. The points (8, 23) and (20,360) give Y = 0.045X 3 . 7. T = 49.5/V 3S . 

4. Y = 0.119X 103 . 6. U = 2.26 1°\ 

Page 344 

1. a. log Y = (log 3)X + log 1, \(Y = 3^X\). 

b. log Y = 0.2X + log 1, \(Y = 10^{0.2X}\). 

c. log Y = 0.1X + log 1, \(Y = 10^{0.1X}\). 

d. log Y = (log 5i)X + log 2, y = 2(55). 

e. log Y = -0.1X + 1, \(Y = 10(10^{-0.1})^X\). 

f. log Y = (log 2i)X + log 2-5, Y = 2^f*. 

Pages 330-354 

2. \(Y = 0.045X^3\). 4. \(Y = 0.04X^2\). 6. \(Y = 4(1.2)^X\). 

7. X = 125(1.649)'. 10. Y = 2.54(1.16)' Y . 

16. Choosing X = 0 at 1909, Y = 0.305X + 7.36. 

17. Choosing X = 0 at 1907, Y = 14.375X + 159.31. 

At 1915, X = 8, Y = 274.31. 

At 1920, X = 13, Y = 346.185. 

18. Choosing X = 0 at 1909, a. Y = 74X + 1,988.7. 
b. Y = 109.8X + 2,053.8. 

19. Choosing X = 0 at 1915, Y = 1.35X + 31.8. 





Pages 357-361 

1. With X = 0 at 1910, L.S. gives Y = 0.435X + 35.2. 

2. L.S. gives \(V = 499.82p^{-1.066}\). 

3. With X = 0 at 1910, L.S. gives 

\(Y = 0.0574X^2 + 2.67X + 94.66\). 

4. With X = 0 at 1900, L.S. gives \(Y = 0.714(1.031)^X\). 

At 1915, X = 15, Y = 1.13. 

At 1928, X = 28, Y = 1.67. 

5. Using L.S., \(S = 44.603(1.049)^\theta\). 

\(S = 0.0014725\theta^2 - 0.474\theta + 49.548\). 

6. L.S. gives \(V = 3.1944 + 0.4516D - 0.7792D^2\). 

If D = 0.9, V = 2.9697. 
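The value quoted for D = 0.9 can be confirmed by substituting into the fitted parabola. The following is a minimal sketch that uses only the three coefficients printed in the answer; the function name is illustrative.

```python
def fitted_v(d: float) -> float:
    """Evaluate the least-squares parabola V = 3.1944 + 0.4516 D - 0.7792 D^2."""
    return 3.1944 + 0.4516 * d - 0.7792 * d * d


print(round(fitted_v(0.9), 4))
```

The same pattern applies to the other fitted curves in this chapter: substitute the stated X, t, or D into the printed equation and compare with the quoted value.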

7. Using first, seventh, and ninth points, 6 — 31.5 + 60(0.9038)'. 

8. With X = 0 at 1910, \(Y = 19(1.086)^X\). 

9. 1 = 4.480J5 0 - 6691 . 

14. L.S. gives with t = 0 at 1909.5, \(X = 393.3(1.0743)^t\). 

15. Using first, sixth, and eleventh points, \(Y = 10.1344 + 1.7521(1.2404)^X\). 


CHAPTER 11 


Pages 365-366 

1. 4. 2. 8. 3. 36. 4. 288. 

5. 504. 6. 2,730. 7. 3,024. 


Page 367 

1. 156. 2. 6,720. 

3. a. 362,880. b. 725,760. c. 725,760. d. 2,903,040. 

4. 720. 5. 10. 7. 30,240. 8. 34,650. 9. 2,520. 


Pages 369-370 


1. 45; 45; 4,950. 
6. a. 126. b. 84. 
12. 3,600. 

16. n = 6. 


2. 50,063,860. 3. 5,880. 

8. 302,400. 

13. a. 700. b. 1,408. 
17. n = 10. 


Pages 371-372 

1 2.048 1.992 

4 . 040 ) 4 . 040 * 

2. rib. tIs. etc, 

3. 0.0085. 


M = 3.43, cr = 1.3, 
4. 0.514. 


4. 45. 5. 63. 

9. 878,948,939. 
15. n = 11, r = 2. 
18. n = 7. 


Pages 373-374 

1* it i> ?• 

4. -h, A, i, h- 

7. i, h i. 

10 . 


2* sV- 

6. 3%. 

8. i,i,l 

11. i 


3- fg, i, 7. 

6. ii- 

9. The former. 

12. s^. 





Pages 376-377 

1- &• 1,001 • t 
5. a. 0.06. 


u * 1 . 001 * 
b. 0.56. 


3. a. 


15(5 4 ) 


b. 


2,906 


6 6 * 6 6 
7. a. 0.2646. b. 0.3483. 
80 


9. a. 


3 5 


i 51 
b ‘ 3*" 


c. 0.38. 


Pages 380-382 

1. a. b. c. etc. 


3. *. 4. f|. 

6. h 7. I 


2. 1, 7, 21, etc. 
. 4,651 

*• -w 

8. 0.09. 

io. 2 ™. 

6 5 


6 . 


56 

2 “' 


11. 25. 


12. a. 10(.94) 3 (.06) 2 . b. (,06) 2 (.94) 3 . 

c. 10(.94) 3 (.06) 2 + 10(.94) 2 (.06) 3 + 5(.94)(,06) 4 + (,06) 5 . 

13. a. (,95) s . b. 10(.95) 2 (.05) 3 . 

c. 10(.95) 2 (.05) 3 + 5(.95)(.05) 4 + (,05) 5 . 

14. b. (.95)“. a. 10(.95) 9 (.05). c. (.95)“ + 10(.95) 9 (.05). 


10 

d. 2,oCr(.95)'(.05)“-'. 15. 25 C,o(.9) 20 (.l) 5 . 

r = 5 


16. a. ,oCfi(.97) 6 (.03) 4 . b. 


100 

17. 2 in oC r (.95)'(.05)“°-'. 

r=9U 


20. 


108 

, 799 

a. 

7 6 ‘ 

b. c. 

22. 

24. 

a. 

4. 

i 

1.024* 

b. c. 

26. i 


2ioC , r(.97) r (.03) 10_r . 

•=a 

19. |. 

f • 21. a. I b. JS. 

W‘ 23. a. 5. b. 

26. I 27. 


c - tIs* 


CHAPTER 12 


Page 390 



         a        b        c        d 
M_o      5        1        4        2 and 3 
M        5        1        3.6      2.4 
σ        0.91     0.91     0.6      0.98 
α_3      -0.73    0.73     1.33     0.20 





Page 395 

1. A 

   X        Graduated f(x) 
   60.5          3.5 
   70.5         28.3 
   80.5         99.0 
   90.5        198.0 
   100.5       247.5 
   110.5       198.0 
   120.5        99.0 
   130.5        28.3 
   140.5         3.5 
   Total       905.1 

   B 

   X        Graduated f(x) 
   29.5          2.0 
   33.5         17.6 
   37.5         70.3 
   41.5        164.1 
   45.5        246.1 
   49.5        246.1 
   53.5        164.1 
   57.5         70.3 
   61.5         17.6 
   65.5          2.0 
   Total     1,000.2 


Page 404 

1. a. 0.9773. d. 0.0227. g. 0.0227. 

b. 0.9836. e. 0.9918. h. 0.9892. 

c. 0.9834. f. 0.0084. i. 0.0027. 

2. a. 2.14. b. 0.65. c. 1.655. d. 0.6553. 


Pages 411-413 

1. t = 5. No. 

2. a. t = 6.1. Yes. b. t = 3.46. Yes. From the point of view of chance,
this might happen, but it is very improbable.

3. 0.0108. 


Grade                     F    E    D    C-   C     C+    B-    B     B+   A-   A    A+
Number receiving grade    1    7    28   79   159   226   226   159   79   28   7    1


7. The values of t for the five parts are: -∞ to -0.8415, -0.8415 to
-0.2533, -0.2533 to +0.2533, 0.2533 to 0.8415, 0.8415 to +∞.
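The cut points in answer 7 are the values of t that divide the area under the standard normal curve into five equal parts. A minimal Python sketch of that calculation follows; it uses the standard library's `statistics.NormalDist` in place of the printed table of areas, so the last decimal can differ slightly from the tabular values quoted above. The helper name `equal_area_cut_points` is hypothetical.

```python
from statistics import NormalDist

def equal_area_cut_points(parts):
    """Values of t that split the standard normal area into equal parts."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(k / parts) for k in range(1, parts)]

cuts = equal_area_cut_points(5)
print([round(t, 4) for t in cuts])
```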

8. 929.0.

9. a. Yes. t = 0.88. b. Yes. t = 2.5. c. No. t = 7.





X     Binomial     Normal         X     Binomial     Normal
      Ordinates    Ordinates            Ordinates    Ordinates
0     .000         .000           9     .175         .176
1     .000         .0005          10    .122         .121
2     .002         .002           11    .067         .065
3     .0085        .009           12    .028         .027
4     .028         .027           13    .0085        .009
5     .067         .065           14    .002         .002
6     .122         .121           15    .000         .0005
7     .175         .176           16    .000         .000
8     .196         .199
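The tabulated ordinates are consistent with the point binomial (1/2 + 1/2)^16 compared against a normal curve with the same mean (8) and standard deviation (2). Those parameters are inferred from the figures, not stated in the answer key, so the Python sketch below is only an illustration under that assumption; the function names are hypothetical.

```python
from math import comb, exp, pi, sqrt

def binomial_ordinate(n, p, x):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def normal_ordinate(x, mean, sd):
    """Height of the normal curve with the given mean and standard deviation."""
    t = (x - mean) / sd
    return exp(-t * t / 2) / (sd * sqrt(2 * pi))

# Assumed parameters (inferred from the table, not given in the source).
n, p = 16, 0.5
mean = n * p                 # mean of the point binomial, np
sd = sqrt(n * p * (1 - p))   # standard deviation, sqrt(npq)

for x in range(n + 1):
    print(x, round(binomial_ordinate(n, p, x), 4),
          round(normal_ordinate(x, mean, sd), 4))
```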





11. 0.9. 12. a. 0.95. b. 0.95. 

13. a. 0.076. b. 0.076. 14. 2.5 inches. 

15. a. 720 men. b. 0.000. c. 0.12. 

16. a. 50 units. b. 36 and 64 units.

17. a. 900. b. Yes. σ = 28.6.

19. $Q_1$ = 53.3, $Q_3$ = 66.7, Q = 6.7, M.D. = 8, $\alpha_3$ = 0, $\alpha_4$ = 3,
87th percentile = 71.2.

20. 8.5 and 18.7. 21. a. 0.24. b. 0.02. c. 0.06. d. 0.007. 


Page 417 

1. Using M = 39.835, adjusted σ = 2.0322, 1/σ = 0.492078, and rounding
the values of t to two decimal places, we have

X        Theoretical f(x)    Theoretical f(x)
         (Ordinates)         (Areas)
33            7                   8
34           32                  34
35          116                 123
36          329                 339
37          737                 746
38         1309                1295
39         1805                1818
40         1957                1929
41         1669                1646
42         1108                1110
43          582                 592
44          240                 252
45           78                  81
46           20                  21
47            4                   4
48            1                   1
Total      9994                9999





Values of t are rounded to two decimal places.

2. M = 67.92. σ = 2.42. 1/σ = 0.4132.

X        Theoretical f(x)
57.5          0.2
58.5          0.6
59.5          2.4
60.5          7.6
61.5         21.4
62.5         47.6
63.5         91.7
64.5        154.1
65.5        207.9
66.5        242.4
67.5        244.8
68.5        199.2
69.5        140.7
70.5         85.6
71.5         41.8
72.5         18.0
73.5          6.5
74.5          2.0
75.5          0.5
76.5          0.2
Total      1515.2

3. M = 6.06. σ = 0.20. 1/σ = 5.00.

X        Theoretical f(x)
5.45          4.3
5.55         14.8
5.65         40.4
5.75         86.3
5.85        144.3
5.95        188.9
6.05        193.5
6.15        155.3
6.25         97.6
6.35         47.9
6.45         18.5
6.55          5.5
6.65          1.3
6.75          0.3
6.85          0.0
Total       998.9



$y = \dfrac{924(4)}{11.0668\sqrt{2\pi}}\, e^{-t^2/2}$, where $t = \dfrac{X - 74.2}{11.0668}$.


CHAPTER 13 

Pages 438-439 

2. 5,625; 22,500. 3. 0.034 pound. 

4. 0.0525 million per cu. mm. 5. a. 0.9255. b. 0.0026.

6. Yes. 12. Yes. t = 0.62. 13. Yes. t = 2.



Pages 449-450 

1. a.

Sample        M'         σ
1st 100     117.15     13.8
2nd 100     120.85     17.4
3rd 100     117.35     17.6
4th 100     122.95     21.4
5th 100     119.25     15.5
6th 100     118.45     17.8
7th 100     118.05     18.0
8th 100     113.65     13.4
9th 100     119.45     17.5
10th 100    120.25     16.2
Total       118.74     17.2

b. Mean of means = 118.74. $M_u$ = 118.74.

c. Mean of σ's = 16.9. $\sigma_u$ = 17.2.

d. Eight. The five per cent limits are 115.37 and 122.11. 

e. Seven. The five per cent limits are 14.8 and 19.6. 

f. Sampling probably went awry on the 4th 100 and the 8th 100, for 
both the mean and the standard deviation are outside the five per 
cent levels. 

2. Difference not due to chance, t = 13+ . 

3. Yes. t = 6+. 

4. Class of 1943 is poorly prepared, t = 4.4. 

Class of 1945 is within 5 per cent level, t = 1.7. 

5. $\sigma_{p_1 - p_2}$ = 0.0078. Difference not significant, t = 0.5+.

6. $\sigma_{p_1 - p_2}$ = 0.0417. Difference not significant, t = 1.1+.

Pages 457-464 

1. $E_m$ = 1.57. There is an even chance that the sample mean, 149.8, does not
differ from $M_u$ by more than ± 1.57.

$\sigma_\sigma$ = 1.64. There is a two to one chance that the sample σ, 42.47, does
not differ from $\sigma_u$ by more than ± 1.64.

2. a. About 0.58. b. About 0.62.

3. a. $\sigma_m$ = 0.364, $\sigma_\sigma$ = 0.257 p.b. per min. b. About 0.994.

4. r = 0.77, $\sigma_r$ = 0.05. There is a two to one chance that the sample r does
not differ from the universe r by more than ± 0.05. $E_r$ = 0.03.

5. $E_m$ = 0.0137 inch. There is an even chance that the sample mean, 39.835
inches, does not differ from the universe mean by more than 0.0137
inch.

6. t = 2.7, and the difference is probably significant.

7. t = 167, and the difference is certainly significant. In fact, Group I
was American soldiers and Group II was Japanese soldiers.

8. For National League, M = 0.283, σ = 0.086.

For American League, M = 0.278, σ = 0.085.
t = 0.47, and the difference is not significant.

10. t = 14.6, and the difference in the means is sufficient to warrant the
conclusion that Scots are taller than Englishmen.





11. Yes. t = 5.8.

13. t = 1.5. 14. t = 11.7. 



        1st Group    2nd Group
N       72,127       17,986
M       $8.37        $9.59
σ       $2.49        $2.43

t = 60. Hence ($M_1 - M_2$) is significant.
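The value t = 60 can be checked from the tabulated N, M, and σ. The Python sketch below assumes the usual large-sample formula, in which the standard error of the difference of two independent means is the square root of the sum of σ²/N for each group; the answer key does not state the formula, and the function name is hypothetical.

```python
from math import sqrt

def t_for_difference_of_means(m1, s1, n1, m2, s2, n2):
    """Difference of two sample means divided by its standard error."""
    # Assumed formula: sigma_diff = sqrt(s1^2 / n1 + s2^2 / n2).
    se_diff = sqrt(s1**2 / n1 + s2**2 / n2)
    return abs(m1 - m2) / se_diff

# Figures from the table above (1st group, then 2nd group).
t = t_for_difference_of_means(8.37, 2.49, 72127, 9.59, 2.43, 17986)
print(round(t, 1))
```

A ratio this large lies far beyond any conventional level of significance, which agrees with the conclusion stated in the answer.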

16. t = 35.3. 17. 96. 

18. t = 4.15. 19. t = 7.96; t = 0. 

20. t = 2.53. Probably significant.

21. Yes. t = 16+. 22. Yes. t = 11.6.

23. a. About 0.73. b. About 44 days. c. N = 25.

24. Yes. For t = 15+ for heights and 13+ for weights.

25. Yes. 



INDEX 

The numbers refer to pages. 


Absolute error, 16 
Absolute value of a number, 121 
Accuracy, in measurements, 14; meas- 
urements of, 15 

Aggregative relatives, simple, 179; 

weighted, 185 
Analysis, statistical, 2 
Arithmetic mean, as a moment, 62;
calculation of, 60, 62, 73; criticism
of, 99; defined, 60; of relatives, 181-182,
180-190; probable error of,
143, 432; standard deviation of, 144, 
430; standard error of, 144 
Array, nature of, 24 
Asymmetry, defined, 43, 52; Pearson’s 
measures of, 151, 152; positive and 
negative, 152-153; quartile measure 
of, 156; third moment as measure 
of, 157 

Average, characteristics of a good, 59; 
uses of an, 59; of relatives, 182-184, 
188-200 

Average deviation, 120 

Base, in index number construction, 
175 

Benson, Paul, Preface 
Bernoulli, James, 395 
Bernoulli Theorem, 395 
Bessel, Friedrich Wilhelm, 138, 452 
Bias, downward, 197; in averages of 
relatives, 197; in use of weights, 198; 
type, 197; upward, 197; weight, 197 
Binomial expansion, 367, 383-395 
Binomial, point, arithmetic mean of, 
388; general form of, 384; gradu- 
ation of data by, 391-395; mode of, 
386; skewness and excess of, 389- 
390; standard deviation of, 388 
Birge, R. T., 456 
Bowley, A. L., 5, 156 
Bradstreet’s index number, 180 
Brahe, Tycho, 339 
Bravais, A., 237 
Brinton, W. C., 466 
Brown, W. and Thomson, G. H., 240 
Bruce, C. W., 110 
Burgess, Robert W., 98, 197 


Camp, B. H., 36, 465 
Carver, H. C., Preface, 456 
Central tendency, 52, 98-101; meas- 
ures of, 59-110 
Chaddock, R. E., 269, 465 
Charts, construction of, 37-53 
Class, boundary, 26; frequency, 25; 
interval, 25, 26, 30; limits, 26, 30; 
mark, 27; unit, 70, 71, 72; width, 
25 

Classification of data, 23 
Coefficient of, correlation, 238, 245, 
246, 254; multiple correlation, 281, 
284, 295, 300, 305; regression, 250, 
295; skewness, 151, 152, 156, 157;
variation, 131 

Column diagram (see Histogram), construction
of, 37; defined, 37
Combination, 366 
Compound interest law, 316 
Confidence limits, 410 
Continuous variate, 6 
Coolidge, J. L., 379 
Correlation, by ranks, 263-265; co- 
efficient of, 237, 245, 246, 254, 265; 
definition of, 240; index, 357; mul- 
tiple, 277-305; non-linear, 355; 
partial, 295; perfect, 239, 287; sum- 
mary of, 247; table, 254; versus 
causation, 267 
Craig, A. T., 391 
Craig, C. C., 456 
Crathorne, A. R., Preface
Crowder, W. F., 35, 201, 288 
Croxton and Cowden, 453, 465 
Curve fitting, by averages, 313; by 
least squares, 307, 314, 331; by 
moments, 307, 331; by selected
points, 311; of exponential function, 
316; of hyperbola, 333; of modified 
exponential, 334; of modified power 
function, 337; of normal curve, 413- 
417; of parabola, 330-333; of power 
function, 323; of straight line, 216, 
222, 311, 314 

Curves, cumulative, 50; exponential, 
316; hyperbolic, 333; J-shaped, 52;
mound-shaped, 44, 52; normal, 53, 




134, 363-409; parabolic, 82, 330; 
skewed, 52, 152, 153, 383, 384 
Czuber, Emanuel, 85 

Data, statistical, 1; grouped, 23; un- 
grouped, 23 
Davenport, D. H., 458 
Davies, George R., 35, 201, 288 
Decile, 119 

Degrees of freedom, 453 
Deming, W. E., 456 
De Moivre, Abraham, 363, 396, 397 
Dependent events, 376 
Dependent variable, 6 
Determinants, defined, 289, 290, 292; 
finding mode by, 110; in multiple 
correlation, 293-305 
Deviation, mean, 121; probable, 119; 
quartile, 115; standard, of a differ- 
ence, 444; standard, of a distribu- 
tion, 125-130; standard, of the 
mean, 144, 430; standard, of the 
standard deviation, 144, 443 
Differences, standard error of, 444-445 
Differencing, process of, 307; use of, 
in curve-fitting, 309-310 
Discrete variate, 6, 15 
Dispersion, 43, 111; meaning of, 111- 
115; measures of, 113 
Distribution of means, defined, 143- 
144; excess of, 441; illustrated, 139,
140, 426; mean of, 144, 429; prob- 
able error of, 144, 432; standard 
deviation of, 144, 430; standard 
error of, 144; skewness of, 439 
Distribution of standard deviations, defined,
144, 442; mean of, 145; standard
deviation of, 145, 442; standard
error of, 144 

Distributions, asymmetrical, 52; cumu- 
lative frequency, 48; J-shaped, 52; 
mound-shaped, 44, 52; normal, 53, 
414-416; simple frequency, 25; 
symmetrical, 52; temporal, 44; U- 
shaped, 52 

Empirical curves, defined, 210, 306; 
limitations of, 338; methods of fit- 
ting, Chapter 10 

Empirical equation, 210, 306, 338 
Empirical probability, 369 
Equation of, exponential functions, 316; 
hyperbola, 333; hyperplane, 298; 
modified exponential, 334; modified 
power function, 337; normal curve, 
119, 396, 400; plane, 278; power 
function, 323; quadratic parabola, 
330; straight line, 210, 311 


Error, absolute, 16; possible, 15; prob- 
able, 137; probable, of mean, 143, 
432; probable, of standard deviation, 
144, 443; relative, 16; standard, of 
estimate, 233, 239, 280, 286, 295, 
300, 304 

Excess (kurtosis), 43, 158, 389, 439 
Expectation, 374 

Exponential function, fitting data to, 
316, 343; when to use with empirical 
data, 317-318 
Ezekiel, Mordecai, 466 

Factor reversal test, 199 
Fisher’s Ideal Index, 198 
Fisher, Irving, 195, 198, 199, 200 
Fisher, R. A., 364, 421, 447, 452, 453, 
465 

Forsyth, C. H., Preface 
Freeman, H. A., 439 
Frequency, class, 25, 422; cumulative, 
48; relative, 372 

Frequency curves, 40, 397; types of, 
52, 418 

Frequency distribution, binomial, 382- 
395; cumulative, 48; normal, 395- 
410, 413-417; simple, 25 
Frequency polygon, 38 
Frequency table, 4, 25
Function, 7 

Gale, A. S., 271 

Gauss, Carl Friedrich, 138, 363, 396 
Gavett, G. I., 271 

Geometric mean, computation of, 90; 
criticism of, 101; defined, 87; of 
relatives, 180-182, 191-193; use of, 
88 

Glover, J. W., Preface, 379, 467 
Goodness of fit, tests for, 211, 213, 230, 
232, 233, 286, 300, 304, 416 
Graduation of a frequency distribution, 
by point binomial, 391-395; by 
normal curve, 413-417 
Graphical representation, 37, 340-350; 
of cumulative distributions, 50; of
simple frequency distributions, 37, 
38, 39; of temporal distributions, 
43-48; with logarithmic paper, 346; 
with semi-logarithmic paper, 342 
Growth, law of organic, 316 

Harmonic mean, computation of, 93; 
defined, 92; of relatives, 181, 188; 
uses of, 93, 95-97, 181, 198 
Hall, Winfield S., 171 
Haskell, S. C., 46, 466 
Histogram, 37 





Holzinger, Karl, 465 
Hotelling, Harold, 153 
Huntington, E. V., 5 
Hyperbola, 333 

Independent, events, 375; variable, 6 
Index numbers, 174-202; as average 
of relatives, 180-182, 188-193; bias 
in, 197-198; defined, 174, 177; 

Fisher’s Ideal, 198; purpose of, 174; 
unweighted, 178-184; weighted, 
185-200 

Index of precision, defined, 399; use 
of, 433 

Interpretation of statistical results, 3, 
364, 410, Chapter 13 
Interval, class, 25, 26, 30 

Jackson, Dunham, 220, 456 

Karsten, K. G., 466 

Kendall, M. G., 1, 59, 145, 465 

Kenney, J. F., 465 

Kepler, Johann, 339 

King, W. I., 49, 81, 200 

Kurtosis, 43 

Kurtz, Edwin B., 172 

Laplace, Pierre Simon, 363, 396 
Least squares, principle of, 211; fitting 
a parabola by, 330; fitting a straight 
line by, 214-222, 314 
Lee, Alice, 463 
Levels of significance, 410 
Linear trends, 203 
Lines of regression, 218, 248-249 
Lipka, Joseph, 466 
Logarithmic paper, 346 

May, Mark, 301 

Mean, arithmetic, 62; geometric, 87; 
harmonic, 92 

Mean deviation, computation of, 122; 
defined, 121 

Median, 49, 76; computation of, 51, 
78, 79; defined, 49, 76 
Mill, J. S., 269 

Mills, F. C., 34, 188, 357, 458, 466 
Mitchell, Wesley C., 200 
Modal class, 81 

Mode, 80; approximate, 80, 81, 84, 85; 

criticism of, 100; crude, 80; true, 80 
Modified exponential function, fitting 
data to, 333; when to use with em- 
pirical data, 333 

Modified power function, fitting data 
to, 337; when to use with empirical
data, 337 

Moment, arithmetic mean as, 62-66 


Moments, adjusted, of a distribution, 
163; computation of, 164; method 
of, in curve-fitting, 160; unadjusted, 
of a distribution, 159; of point bi- 
nomial, 387-389; of normal curve, 
405 

Multiple correlation, coefficient of, 281, 
287; defined, 277, 288 

Mutually exclusive events, 374 

Normal curve, defined by equation, 53, 
119, 396; derivation of equation to, 
397; graduation of distribution by, 
413-417; history of, 363, 396; mo- 
ments of, 405; properties of, 401; 
uses of, 134, 397, 405-409 

Normal equations, 215, 217, 278, 283, 
298, 303 

Null hypothesis, 447 

Numerical value of a number, 121 

Ogive, 48 

Organic growth, law of, 316 

Organization of data, 1, 5 


Parent population (universe), 3, 23, 
363, 419, 451

Parkes, A. S., and Drummond, J. C., 450

Pearl, Raymond, 105, 301, 466 
Pearson, K. S., 439 

Pearson, Karl, 85, 151, 237, 391, 416, 
419, 456, 463, 467 
Percentiles, 119 
Permutation, 364 
Point binomial, 383-395 
Power function, fitting data to, 323,
347; when to use with empirical 
data, 324 

Precision, index of, 399, 433 
Preliminary sheet, 25, 253 
Probability, 369; a priori, 372; em- 
pirical, 369; theorems on, 374-378 
Probable deviation, 119 
Probable error, 118, 137, 403; defined, 
137, 403; of any measure, 137, 403; 
of the arithmetic mean, 143, 432; of 
the standard deviation, 146, 443 


Quadratic parabola, fitting data to, 330; 
in finding approximate mode, 83; 
when to use with empirical data, 330 
Quartile deviation, 115 
Quartiles, computation of, 117, 120; 
defined, 115; in measuring disper- 
sion, 118; in measuring skewness, 
156 

Quetelet, Adolphe, 363 





Range, 24, 113, 114 
Regression, coefficients of, 250, 295; 
of Y on X, 250; of X on Y, 250; 
multiple, 295; plane, 278; hyper- 
plane, 299 

Relatives, defined, 174; chain, 176; 
fixed base, 176; link, 176; simple 
aggregative, 179; simple arithmetic 
mean of, 181; simple geometric 
mean of, 181; simple harmonic mean
of, 181; weighted aggregative, 185; 
weighted arithmetic mean of, 188; 
weighted geometric mean of, 191; 
weighted harmonic mean of, 188 
Relative error, defined, 16; in a prod- 
uct, 19; in a quotient, 20 
Relative frequency, 369, 424 
Relative variability, 113, 118, 122, 131 
Reliability, defined, 143, 421; of a
difference, 444-445; of the mean, 
143, 453; of the standard deviation, 

145, 453 

Repeated trials, theorem of, 378 
Residual, 210 
Riebesell, Paul, 42 

Rietz, H. L., 164, 416, 424, 456, 465, 466 
Robinson, George, 168 
Rounding off numbers, 16 
Running, T. R., 310, 466 

Sample, 3, 23, 363, 419; small, 450 

Scarborough, J. B., 18, 466 

Scatter diagram, 234, 253 

Secrist, Horace, 53, 250 

Secular trend, 226 

Selection ( see Combination), 366 

Semi-logarithmic paper, 342 

Sheppard’s Corrections, 163-167 

Shewhart, W. A., 456 

Significant difference, 409, 443-448 

Significant figures, 15 

Simple frequency distribution, 25 

Skewness, 150-157; defined, 43, 150; 

measurement of, 151-157 
Slope of a straight line, 205 
Snedecor, G. W., 466 
Solomons, Leonard M., 153 
Sorenson, Herbert, 35, 296, 438, 466 
Standard deviation, computation of, 
126, 127, 129, 130; defined, 125; in 
class frequencies, 125, 423; of the 
mean, 144, 430, 453; of a percentage, 
425; of the standard deviation, 144- 

146, 443, 453 

Standard error, of estimate, 233, 280; 
of the mean, 144; of the standard 
deviation, 144 


Standard unit, 162, 250, 400 
Statistical, analysis, 2; constant, 3; 
data, 1; induction, 364, 420; in- 
ference, 364; methods, 1 
Stirling’s Formula, 379 
Straight line, fitting observed data to, 
210, 311-315; intercepts of, 207; 
properties of, 206; slope of, 205
Summation, defined, 7; limits of, 8; 

theorems on, 9 
Surface, F. M., 105, 301 
Symmetrical distributions, 52 

Tabular presentation, 23, 25 
Tabulation of data, 23, 53 
Tallying, 25, 253 
Temporal distribution, 44, 226 
Tests of significance, 409, 446 
Thurstone, L. L., 34, 411 
Time reversal test, 197 
Time series, 43; fitting a straight line 
to, 226, 342 

Tippett, L. H. C., 449, 465 
Treloar, Alan E., 364, 462, 465 
Trend, linear, 203; non-linear, 306 
True class limits, 26 
Tycho Brahe, 339 
Tyler, R. W., Preface 

Unit, class, 70, 71, 72, 128, 161, 245; 

standard, 162 
Universe, 3, 23, 363, 419 
Unweighted index numbers, 179-184 

Variability, absolute, 113; relative, 
118, 122, 131-133 

Variable, dependent, 6; independent, 6 
Variance, 126, 442 

Variates, 6; continuous, 6; discrete, 6 
Variation, coefficient of, 131 

Walker, H. M., 1 
Walsh, C. M., 200
Watkeys, C. W., 271 
Waugh, Albert E., 412, 465 
Weighted aggregative relative, 185 
Weighted averages, 188-193 
Weighted index numbers, 188, 191, 
195-199 

Weighted mean, 67, 90 
Wembridge, H. A., 148 
White, R. C., 34 
Whittaker, E. T., 168 
Winfrey, Robley, 172 
Wolfenden, H. H., 466 

Yoder, Dale, 35 

Yule, G. U., 1, 59, 145, 465 






